The problem investigated in this study is that of evaluating psychological
tests as aids to the selection of personnel for training and jobs. When an
institution uses a test for the purpose of personnel selection, some
estimate of its value as a decision-making tool is needed by psychologists
and management. The conventional approach to test evaluation, namely,
correlational analysis, ignores three important situational factors: how
well the institution could do by chance (commonly called the "base rate"),
the proportion to be selected from the population (the selection ratio),
and the institutional gains and losses resulting from correct decisions and
incorrect decisions.

A method based on statistical decision theory was developed which handles
these factors explicitly and systematically. The method, as presented, is
restricted to the dichotomous (or dichotomized) criterion case and does not
rely on the correlation coefficient as an index of association between the
test and the criterion. The decision-theoretic method involves the
construction of a payoff matrix corresponding to the contingency table
relating the test to the criterion. The cell frequencies are weighted in a
utility equation by the payoff values (utilities) in the corresponding
cells of the payoff matrix. This utility equation represents a new test
evaluation index that directly expresses the utility of the test to the
institution using it.

Also presented is a method based on Brogden's publications on this problem.
It involves the comparison of criterion groups, e.g., satisfactory and
unsatisfactory, in terms of their utility to the institution using the
selection test. It is called the "utility function" method since the
criterion is converted to a utility scale.

The three methods (correlational, decision-theoretic, and utility function)
were compared with tests used to select students for technical schools in
the U. S. Navy. Scaling techniques were developed for the measurement of
values inherent in the Navy situation. Specifically, the graduate-fail
criterion was translated to a utility scale and the corresponding job areas
were scaled on need (or the relative utility of graduates to the Navy).
Using scale values obtained for the job areas, a payoff matrix was
constructed for each school on the assumption that the currently used test
cutoffs are optimal.

The three methods led to quite different indications regarding the utility
of the selection tests evaluated. The decision-theoretic and utility
function methods agreed in terms of the proportion improvement over chance
prediction provided by the tests, while the correlational method tended to
underestimate this proportion. In terms of utility, the decision-theoretic
method indicated the tests were worth much more to the Navy than did the
other two methods.

In addition to the above, the following conclusions were stated:
(1) Statistical decision theory is well suited to the usual selection
testing situation. (2) Psychological scaling methods provide a solution for
the measurement of values required in the application of the
decision-theoretic approach to test evaluation. (3) Supplementation of
correlational analysis of tests with decision-theoretic analysis is likely
to lead to new insights into the utility and use of tests for personnel
selection.

CHAPTER I

STATEMENT  OF  THE  PROBLEM  AND  THEORETICAL  BACKGROUND 

The problem investigated in this study is that of evaluating psychological
tests as aids to the selection of personnel for training and jobs. When an
institution uses a test for the purpose of personnel selection, some
estimate of its value as a decision-making tool is needed by psychologists
and management. The conventional approach to test evaluation, namely,
correlational analysis, ignores three important situational factors: how
well the institution could do by chance (commonly called the "base rate"),
the proportion to be selected from the population (the selection ratio),
and the institutional gains and losses resulting from correct decisions and
incorrect decisions.

In  an  attempt  to  contribute  to  more  adequate  test  evaluation, 
three  tasks  are  undertaken  in  this  study: 

(1)  Demonstration  of  the  need  for  a  new  approach  to  selection 
test  evaluation. 

(2) Development of a mathematically rigorous yet practical approach to
selection test evaluation which explicitly utilizes information about the
base rate, selection ratio, and institutional gains and losses.

(3)  Empirical  tryout  of  the  test  evaluation  approach  developed 
in  this  study. 

It is assumed throughout that to "evaluate" a test means to determine its
value for a specific decision in a specific applied situation.

Background of Two Diverse Approaches to Test Evaluation
The  Conventional  Approach 

Personnel tests are typically evaluated by determining the correlation
between the test and a criterion, usually some measure of performance. The
resultant coefficient is commonly called the validity coefficient. Several
indices have been developed for interpreting validity coefficients; the one
having the longest history is the "index of forecasting efficiency," E:

$$E = 1 - \sqrt{1 - r^{2}},$$

where r is the correlation between the predictor and the criterion.

This  index  compares  the  standard  error  of  criterion  scores  predicted 
by  means  of  the  test  to  the  standard  error  of  chance  estimates.  The 
proportionate  reduction  of  the  standard  error  is  taken  as  a  measure  of 
the  value  of  the  test. 

The "coefficient of determination," $r^2$, is another index that is used to
evaluate tests. This index expresses the ratio of predicted variance in the
criterion to the total variance. Use of this index and the index of
forecasting efficiency requires that the correlation be reasonably high
(about .50) in order to conclude that the test is substantially beneficial.
The index of forecasting efficiency describes a test with such validity as
predicting only 13 per cent better than chance, while the coefficient of
determination describes such a test as accounting for 25 per cent of the
variance in the criterion.
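
As a quick numerical check of the two indices just described, the following Python sketch evaluates E and $r^2$ for a validity coefficient of .50. The function names are illustrative only and do not come from the study.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Index of forecasting efficiency, E = 1 - sqrt(1 - r^2)."""
    return 1.0 - math.sqrt(1.0 - r * r)


def coefficient_of_determination(r: float) -> float:
    """Proportion of criterion variance predicted by the test, r^2."""
    return r * r


r = 0.50
E = forecasting_efficiency(r)
r_squared = coefficient_of_determination(r)

# Report both indices as percentages, as in the text.
print(f"r = {r:.2f}: E = {100 * E:.0f} per cent, r^2 = {100 * r_squared:.0f} per cent")
```

Both indices make a test with a validity of .50 look modest, which is the interpretation the text goes on to question.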

The major variation on this approach is due to Brogden (1946). He
demonstrated mathematically, through manipulation of the formulas for r,
that r, not E or $r^2$, is a direct measure of the proportion improvement
over chance prediction afforded by a selection test. Thus, an r of .05
indicates that the test provides five per cent of the improvement over
chance that a perfect test would provide; an r of .50, 50 per cent; an r of
.95, 95 per cent. This means, if the correlational approach is valid, that
the units on the r scale are equal in value to the institution using the
test, a great departure from the implications of E and $r^2$ that the units
near 1.00 are much more important than the units near zero. (For example, E
is .004 greater for an r of .10 than for an r of .05, while it is .12
greater for an r of .95 than for an r of .90. This implies that the units
between .90 and .95 are 30 times as important to the institution as the
units between .05 and .10.)
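
The parenthetical comparison above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the variable names are assumptions. Note that the ratio of "30 times" in the text follows from the rounded increments (.12 / .004); the unrounded increments give a ratio a little under 33.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Index of forecasting efficiency, E = 1 - sqrt(1 - r^2)."""
    return 1.0 - math.sqrt(1.0 - r * r)


# Gain in E for the same .05 step in r at the bottom and top of the scale.
low_gain = forecasting_efficiency(0.10) - forecasting_efficiency(0.05)
high_gain = forecasting_efficiency(0.95) - forecasting_efficiency(0.90)

rounded_ratio = round(high_gain, 2) / round(low_gain, 3)
exact_ratio = high_gain / low_gain

print(f"gain from .05 to .10: {low_gain:.3f}")
print(f"gain from .90 to .95: {high_gain:.2f}")
print(f"ratio (rounded increments): {rounded_ratio:.0f}")
print(f"ratio (unrounded increments): {exact_ratio:.1f}")
```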

Subsequently, Brogden (1949) developed an index of selection test value
that avoided some of the restrictive assumptions of r, namely, normal
distributions and linear regression. When the empirical data conform to
these assumptions, Brogden's index theoretically equals r. He also
advocated use of utility scales as criteria in place of conventional
measures of performance.

Chapter II is devoted to pointing out some of the limitations of the
correlational approach. A method based on Brogden's approach is developed
and presented in Chapter IV. It is called the "utility function" method.

The Decision-Theoretic Approach

Taylor and Russell (1939) took the first major step toward the
decision-theoretic approach. They contended that the value of a test varies
with the particular decision to be made, and that the problem is one of
improving selection rather than of simply raising the correlation of a test
with some criterion measure. They showed that considerable benefit can be
obtained from tests with rather low validity. Benefit was defined as the
difference between the proportion of employees likely to be "satisfactory"
before and after selection by means of the test. This difference was as
much dependent on the a priori probability (commonly called the base rate)
and the selection ratio as it was upon the validity coefficient. (This is
demonstrated in Chapter II.)

The next major advance in this approach came 18 years later with the
publication of the monograph Psychological Tests and Personnel Decisions by
Cronbach and Gleser (1957). Cronbach and Gleser took the position that the
ultimate purpose of any personnel testing program is to assist in making
decisions in regard to what should be done with an individual, and that the
soundest approach to evaluating a test or testing program is through
determining the benefits which accrue to the institution or individual as a
result of the decisions which have been made. These writers used the
concept of "utility" as a measure of test value and defined it as the
benefits which accrue from a set of decisions less the total costs which
are incurred in the decision-making process. Thus, this approach is a
pragmatic one stressing the consequences of direct action (selection
decisions) instead of abstract standards of predictive efficiency.

The most formidable and complex aspect of carrying out this approach in
practice is quantifying the relative utility of decision outcomes. Cronbach
and Gleser (1957) make no contribution to the solution of this problem,
other than pointing to it and discussing its relevance. However, there is
an extensive history of value measurement and psychological scaling which
is directly applicable. The present study attempts to draw on this
knowledge for a solution of the test evaluation problem.

It should be noted that decision theory did not introduce the problem of
values into the decision process and hence into personnel selection. It
does, however, make it explicit. Value systems have always entered into
decisions, but they were not heretofore clearly recognized or
systematically handled.

Plan  of  the  Study 

Chapter II is devoted to demonstrating some of the limitations of the
correlational approach for evaluating selection tests. Then in Chapter III
personnel selection on the basis of psychological tests is presented in
statistical decision theory terms. It is shown that this theory treats the
base rate, selection ratio, and institutional gains and losses explicitly
and systematically. This formulation of selection test theory, unlike the
Cronbach and Gleser one, is restricted to the dichotomous (or dichotomized)
criterion case and does not rely on the correlation coefficient as an index
of association between the test and the criterion.

Two new indices for evaluating selection tests are developed in Chapter IV.
One is based on statistical decision theory as presented in Chapter III and
the other is based on Brogden's approach (1949).

The  next  two  chapters,  V  and  VI,  deal  with  utilities  and  ways  to 
measure  them.  Two  psychological  scaling  methods  are  described  and  applied 
in  an  empirical  situation.  A  way  to  determine  payoff  matrices  given 
these  scale  values  is  presented.  This  method  is  applied  to  the  scale 
values  and  the  final  payoff  matrices  are  determined. 

An empirical tryout of the new indices is reported in Chapter VII.
Selection test scores and final grades were obtained for large samples of
students in U. S. Navy technical schools. The index values, as well as r
and E, are presented and compared in terms of their indications of the
predictive efficiency and utility of the selection tests.

CHAPTER II

LIMITATIONS  OF  THE  CORRELATIONAL  APPROACH 

There can be no doubt that validity coefficients dominate the test
evaluation scene. Of the 426 abstracts in the Handbook of Employee
Selection (1950), 236 use a validity coefficient as the sole measure of
test value. Manuals of published tests rarely report anything on test value
except validity coefficients. Only about one-half of the reviews of
aptitude tests in the Fifth Mental Measurement Yearbook (1959) cite any
evidence of test value other than validity coefficients. Of the 32
abstracts in the "Validity Information Exchange" of Personnel Psychology in
1959, not a single one reported any numerical analysis indicating test
value except validity coefficients.

The inappropriateness of validity coefficients as selection test evaluation
indices is due to the following four limitations. (In each case the
statistical assumptions underlying validity coefficients, namely,
normality, linearity, and homoscedasticity, are granted. Since both of the
special product-moment correlation coefficients recommended under these
assumptions, the biserial r and the tetrachoric r, are approximations to a
Pearson r and are generally equivalent to it when these assumptions are
true [Guilford, 1956, pp. 297-310], the limitations apply to them as well.
The question as to whether the limitations also apply to phi is not raised
because point distributions, or "genuine dichotomies," are not discussed.)

Validity Coefficients are Independent
of the Selection Ratio

The selection ratio is the proportion of applicants (or population tested)
to be accepted. It may be any proportion between zero and 1.00. The
validity coefficient is independent of the selection ratio but test value
is not. Consider Table 1 where the entries are the proportion of accepted
applicants who are satisfactory in terms of job proficiency. These entries
can be compared with the a priori probability .50 which is the proportion
that would have been satisfactory had selection been random. The variation
in each row shows variation in test value which is not accounted for by the
correlation between test score and job proficiency. Take for instance the
row pertaining to an r of .50; if the selection ratio is .05, 88 per cent
will be satisfactory, a sizeable increase over the a priori probability; if
the selection ratio is .95, 52 per cent will be satisfactory, a very slight
improvement over the a priori probability. The correlation may not
adequately indicate the value of the test in any specific situation. It can
be seen from the table that a test with almost any validity may or may not
be of much value depending upon the selection ratio.
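
The dependence of test value on the selection ratio can also be checked numerically. The sketch below is not taken from Taylor and Russell; it assumes a standard bivariate normal model for test and criterion (the model their tables are understood to rest on) and integrates it with a simple midpoint rule. The function name, the integration width, and the step count are all assumptions made for illustration. For r = .50 and an a priori probability of .50 it should land close to the tabled .88 and .52.

```python
import math
from statistics import NormalDist


def proportion_satisfactory(r, base_rate, selection_ratio, steps=4000):
    """Approximate P(criterion satisfactory | test score above cutoff)
    under a standard bivariate normal model with correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("this sketch handles only -1 < r < 1")
    std_normal = NormalDist()
    x_c = std_normal.inv_cdf(1.0 - selection_ratio)  # test cutoff
    y_c = std_normal.inv_cdf(1.0 - base_rate)        # criterion cutoff
    spread = math.sqrt(1.0 - r * r)  # conditional s.d. of criterion given test
    width = 10.0 / steps             # integrate from x_c to x_c + 10
    joint = 0.0
    for k in range(steps):
        x = x_c + (k + 0.5) * width
        # P(criterion above y_c | test score = x)
        p_satisfactory = std_normal.cdf((r * x - y_c) / spread)
        joint += std_normal.pdf(x) * p_satisfactory * width
    # Divide P(accepted and satisfactory) by P(accepted).
    return joint / selection_ratio


for selection_ratio in (0.05, 0.50, 0.95):
    value = proportion_satisfactory(0.50, 0.50, selection_ratio)
    print(f"r = .50, selection ratio = {selection_ratio:.2f}: {value:.2f}")
```

With the validity held fixed, only the selection ratio changes between the printed lines, which is the point the paragraph above makes.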

Validity  Coefficients  are  Independent 
of  the  A  Priori  Probability 

The a priori probability is the proportion who will be satisfactory if
selection is random. Test value is very much dependent upon the a priori
probability but validity coefficients are not. Table 2 will clarify this.
As in Table 1 the entries are the proportions of accepted

TABLE 1

THE PROPORTION WHO WILL BE SATISFACTORY AMONG THOSE
SELECTED, WHEN THE A PRIORI PROBABILITY IS .50
(FROM TAYLOR AND RUSSELL, 1939)

                              Selection Ratio
   r   .05   .10   .20   .30   .40   .50   .60   .70   .80   .90   .95

 .00   .50   .50   .50   .50   .50   .50   .50   .50   .50   .50   .50
 .05   .54   .54   .53   .52   .52   .52   .51   .51   .51   .50   .50
 .10   .58   .57   .56   .55   .54   .53   .53   .52   .51   .51   .50
 .15   .63   .61   .58   .57   .56   .55   .54   .53   .52   .51   .51
 .20   .67   .64   .61   .59   .58   .56   .55   .54   .53   .52   .51
 .25   .70   .67   .64   .62   .60   .58   .56   .55   .54   .52   .51
 .30   .74   .71   .67   .64   .62   .60   .58   .56   .54   .52   .51
 .35   .78   .74   .70   .66   .64   .61   .59   .57   .55   .53   .51
 .40   .82   .78   .73   .69   .66   .63   .61   .58   .56   .53   .52
 .45   .85   .81   .75   .71   .68   .65   .62   .59   .56   .53   .52
 .50   .88   .84   .78   .74   .70   .67   .63   .60   .57   .54   .52
 .55   .91   .87   .81   .76   .72   .69   .65   .61   .58   .54   .52
 .60   .94   .90   .84   .79   .73   .70   .66   .62   .59   .54   .52
 .65   .96   .92   .87   .82   .77   .73   .68   .64   .59   .55   .52
 .70   .98   .95   .90   .85   .80   .75   .70   .65   .60   .55   .53
 .75   .99   .97   .92   .87   .82   .77   .72   .66   .61   .55   .53
 .80  1.00   .99   .95   .90   .85   .80   .73   .67   .61   .55   .53
 .85  1.00   .99   .97   .94   .88   .82   .76   .69   .62   .55   .53
 .90  1.00  1.00   .99   .97   .92   .86   .78   .70   .62   .56   .53
 .95  1.00  1.00  1.00   .99   .96   .90   .81   .71   .63   .56   .53
1.00  1.00  1.00  1.00  1.00  1.00  1.00   .83   .71   .63   .56   .53

TABLE 2

THE PROPORTION WHO WILL BE SATISFACTORY AMONG THOSE
SELECTED, WHEN THE SELECTION RATIO IS .50
(FROM TAYLOR AND RUSSELL, 1939)

                            A Priori Probability
   r   .05   .10   .20   .30   .40   .50   .60   .70   .80   .90   .95

 .00   .05   .10   .20   .30   .40   .50   .60   .70   .80   .90   .95
 .05   .05   .11   .21   .31   .42   .52   .62   .71   .81   .91   .95
 .10   .06   .11   .22   .33   .43   .53   .63   .73   .82   .91   .96
 .15   .06   .12   .23   .34   .45   .55   .65   .74   .83   .92   .96
 .20   .07   .13   .25   .36   .46   .56   .66   .76   .84   .93   .97
 .25   .07   .13   .26   .37   .48   .58   .68   .77   .86   .93   .97
 .30   .07   .14   .27   .38   .49   .60   .69   .78   .87   .94   .97
 .35   .08   .15   .28   .40   .51   .61   .71   .80   .89   .95   .98
 .40   .08   .16   .29   .41   .53   .63   .73   .81   .89   .95   .98
 .45   .08   .16   .30   .43   .54   .65   .74   .83   .90   .96   .98
 .50   .09   .17   .31   .44   .56   .67   .76   .84   .91   .97   .99
 .55   .09   .17   .32   .46   .58   .69   .78   .86   .92   .97   .99
 .60   .09   .18   .34   .47   .60   .70   .80   .87   .94   .98   .99
 .65   .10   .18   .35   .49   .62   .73   .82   .89   .95   .98  1.00
 .70   .10   .19   .36   .51   .64   .75   .84   .91   .96   .99  1.00
 .75   .10   .19   .37   .52   .66   .77   .86   .92   .97   .99  1.00
 .80   .10   .20   .38   .54   .68   .80   .88   .94   .98  1.00  1.00
 .85   .10   .20   .39   .56   .71   .82   .91   .96   .99  1.00  1.00
 .90   .10   .20   .40   .58   .74   .86   .94   .98  1.00  1.00  1.00
 .95   .10   .20   .40   .60   .77   .90   .97   .99  1.00  1.00  1.00
1.00   .10   .20   .40   .60   .80  1.00  1.00  1.00  1.00  1.00  1.00

applicants who are satisfactory in terms of job proficiency. Comparison of
each entry with the appropriate a priori probability, i.e., the one that
heads the column in which the entry is located, provides a meaningful
indication of test value. The difference between the a priori probability
and an entry is the improvement over chance which the predictor makes
possible. The variation in these differences within any row is the
variation in test value that is not accounted for by the validity
coefficient which heads that row. For example, the differences between the
a priori probabilities and the entries in the row pertaining to an r of .50
are .04, .07, .11, .14, .16, .17, .16, .14, .11, .07, .04. These
differences for an r of 1.00 are .05, .10, .20, .30, .40, .50, .40, .30,
.20, .10, .05.

We may conclude, therefore, that the validity coefficient may not
adequately represent the value of a test in a specific situation. Even a
very high correlation is not very good evidence that the test is worth
much. A test that correlates .90 with a criterion may be worth no more than
a test that correlates .30 with a criterion: when the first criterion has
an a priori probability of .10 or .90 and the second criterion has an a
priori probability of .50, the differences between the a priori
probabilities and the corresponding entries in the table are equal.
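
The comparisons drawn from Table 2 can be verified directly from the tabled values. The sketch below copies three rows of Table 2 (selection ratio .50) and subtracts the a priori probability heading each column; the dictionary layout and function name are illustrative assumptions.

```python
# A priori probabilities heading the columns of Table 2.
base_rates = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]

# Rows of Table 2 for three validity coefficients.
table2 = {
    0.30: [0.07, 0.14, 0.27, 0.38, 0.49, 0.60, 0.69, 0.78, 0.87, 0.94, 0.97],
    0.50: [0.09, 0.17, 0.31, 0.44, 0.56, 0.67, 0.76, 0.84, 0.91, 0.97, 0.99],
    0.90: [0.10, 0.20, 0.40, 0.58, 0.74, 0.86, 0.94, 0.98, 1.00, 1.00, 1.00],
}


def improvement_over_chance(r):
    """Tabled proportion satisfactory minus the a priori probability."""
    return [round(entry - p, 2) for entry, p in zip(table2[r], base_rates)]


gains = {r: dict(zip(base_rates, improvement_over_chance(r))) for r in table2}

print("r = .50:", improvement_over_chance(0.50))
print("r = .90, a priori probability .10:", gains[0.90][0.10])
print("r = .90, a priori probability .90:", gains[0.90][0.90])
print("r = .30, a priori probability .50:", gains[0.30][0.50])
```

The last three printed values coincide, which is the sense in which a test with a validity of .90 may be worth no more than one with a validity of .30.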


All  Errors  of  Measurement  Attenuate 
the  Validity  Coefficient 

When all observations in the criterion-test plot fall in a straight line,
the correlation is perfect, i.e., r = 1.00. Any deviations from a straight
line result in an r less than 1.00. Such deviations are said to "attenuate"
r. Therefore, when r, or any correlation coefficient derived from r such as
the biserial r and the tetrachoric r, is used as an evaluation index, the
assumption is implicit that all deviations from the line representing
perfect correlation are important. In other words all such deviations are
assumed to have practical significance. It can be argued, however, that
only deviations which affect the decision for which the test is used should
attenuate the evaluation index.

When a psychological test is used as an aid in making decisions, the most
common practice is to set a cutoff on the test scale and make one decision
about persons who receive a score above that point and the complementary
decision about persons who receive a score below that point. In the
personnel selection situation the decisions are to accept or to reject the
persons for the assignment. Such a situation is depicted in Figure 1. The
cutoff is labeled $x_c$. The line passing through the plot is the
regression line, the line of best fit in a least-squares sense (Guilford,
1956, p. 366). If a person whose score on the test exceeds $x_c$ receives a
score located at $y_1$ on the criterion, the test could be said to have
made an erroneous decision since, had the test predicted perfectly, this
person would have been rejected. However, if this "accepted" person
received a criterion score above $y_2$, regardless of which one, the
decision based on the test must be considered correct. Similarly, the
decision to reject a person must be considered correct if his criterion
score is below $y_2$.

Fig. 1.--An exemplary scatter plot showing the regression line and the
cutoff used in making decisions. (Vertical axis: Criterion.)

The establishment of, and adherence to, a cutoff divides the scatter plot
into four areas shown in Figure 2. Deviations from the regression line in
areas B and C are not errors and should not attenuate the evaluation index
if it is to be taken as an estimate of the value of the test in this
decision situation. Only deviations which lead to an erroneous decision
should be considered errors. These are observations which fall in areas A
and D. Validity coefficients, of course, consider every observation that
falls off the regression line as an error regardless of its importance to
the decision.

Fig. 2.--A scatter plot showing the four decision-related areas determined
by the cutoff and the regression line. (Vertical axis: Criterion;
horizontal axis: Test.)

Furthermore, the size of a deviation from the regression line in areas B
and C is irrelevant. All observations in each of these cells should receive
equal weight in the evaluation index since they are all equally correct: a
perfect test would have led to the same decision in every case and,
therefore, to the same consequences. This is not true of the validity
coefficient, which weights observations in proportion to the size of their
deviations from the regression line. It seems reasonable to contend that
differential weighting within these areas is illogical when attempting to
determine the value of a test for a dichotomous decision.
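
The argument of this section can be illustrated by tallying decision outcomes rather than measuring deviations from a regression line. The sketch below is hypothetical throughout: the data, the cutoffs, the outcome labels, and the treatment of scores exactly at a cutoff (counted as "above") are assumptions, and the labels are deliberately not mapped to the letters A to D of Figure 2, whose layout is not reproduced here.

```python
from collections import Counter


def decision_outcome(test_score, criterion_score, test_cutoff, criterion_cutoff):
    """Classify one person by the decision taken and the result obtained."""
    accepted = test_score >= test_cutoff
    satisfactory = criterion_score >= criterion_cutoff
    if accepted and satisfactory:
        return "correct acceptance"
    if accepted and not satisfactory:
        return "erroneous acceptance"
    if not accepted and satisfactory:
        return "erroneous rejection"
    return "correct rejection"


def tally(observations, test_cutoff, criterion_cutoff):
    """Count the four kinds of outcome for (test, criterion) pairs."""
    return Counter(
        decision_outcome(x, y, test_cutoff, criterion_cutoff)
        for x, y in observations
    )


# Hypothetical (test score, criterion score) pairs.
observations = [(62, 75), (70, 58), (45, 40), (38, 66), (55, 61), (30, 20)]
before = tally(observations, test_cutoff=50, criterion_cutoff=60)

# Move one correctly accepted person much farther from any regression line
# without crossing the criterion cutoff: the tally is unchanged.
shifted = [(62, 99)] + observations[1:]
after = tally(shifted, test_cutoff=50, criterion_cutoff=60)

print(before)
print(before == after)
```

A product-moment correlation computed on the two data sets would differ, while the count of correct and erroneous decisions does not.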

Validity  Coefficients  do  not  Adequately  Reflect 
Institutional  Gains  and  Losses 

A validity coefficient in selection testing is an index of strength of
predictive association between a selection test and a criterion (usually
some measure of performance). As such, the only link with institutional
gains and losses is through the criterion. Implicit in the use of r as an
evaluation index is the assumption that the utility function of the
criterion is linear, i.e., that equal increments of the criterion represent
equal increments of utility or value to the institution using the test.
This assumption is rarely tested with quantitative research. In fact, it is
rarely mentioned in the psychometrics literature.

Following the logic of the previous section, a more reasonable assumption
in general for selection tests would be that the utility function is
stepwise about the point on the criterion corresponding to the test cutoff.
Consideration of this point is what usually leads to the choice of the
cutoff. It seems reasonable to expect the criterion units around this point
to be more important to the institution than those far above or below this
point.

Actually, of course, the shape of the utility function of the criterion in
an applied situation is an empirical question to be answered ideally
through research. In the absence of such research the most reasonable
assumption should be stated and an evaluation index used which does not
violate that assumption. In selection test evaluation it would seem that
any evaluation index based on product-moment correlation theory should be
avoided.

Another point mentioned in the previous section is that observations which
fall off the regression line are weighted by the validity coefficient in
proportion to their distance from the regression line. Institutional gains
and losses are not expressly taken into account. The two extreme types of
deviations are commonly called false positives and false negatives. (In
subsequent chapters these are called erroneous acceptees and erroneous
rejectees.) The implicit assumption in correlational analysis is that these
are equally costly to the institution using the test. Whether or not they
are equally costly is an empirical question. Their actual cost to the
institution should be determined through research.
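
To see why the relative cost of the two kinds of error matters, consider a generic payoff-weighted sum of outcome frequencies. This is only a sketch of the general idea and not the utility equation developed later in the study; the outcome counts and payoff values are invented for illustration.

```python
def total_utility(counts, payoffs):
    """Sum of outcome frequencies weighted by their payoffs."""
    return sum(counts[outcome] * payoffs[outcome] for outcome in counts)


# Two hypothetical tests with the same number of errors (20 of 100)
# but a different mix of erroneous acceptees and erroneous rejectees.
test_a = {"correct accept": 40, "correct reject": 40,
          "erroneous accept": 15, "erroneous reject": 5}
test_b = {"correct accept": 40, "correct reject": 40,
          "erroneous accept": 5, "erroneous reject": 15}

# Hypothetical payoffs: errors equally costly versus unequally costly.
equal_costs = {"correct accept": 1, "correct reject": 1,
               "erroneous accept": -1, "erroneous reject": -1}
unequal_costs = {"correct accept": 2, "correct reject": 1,
                 "erroneous accept": -3, "erroneous reject": -1}

for label, payoffs in (("equal costs", equal_costs),
                       ("unequal costs", unequal_costs)):
    print(label, total_utility(test_a, payoffs), total_utility(test_b, payoffs))
```

Under equal costs the two tests are indistinguishable; under unequal costs the test that makes fewer of the costlier errors is worth more, even though the error totals are identical.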


In this chapter some of the inadequacies of the conventional approach to
selection test evaluation have been discussed, showing that a new approach
is needed and that a more adequate approach should handle the following
factors:

(1) selection ratio,

(2) a priori probability,

(3) institutional gains and losses.

The next chapter presents the theoretical foundation of an approach based
on statistical decision theory which handles these factors explicitly and
systematically.

CHAPTER III

SELECTION  TESTS  AND  STATISTICAL  DECISION  THEORY 

The monograph by Cronbach and Gleser (1957) was the first and most direct,
large scale restatement of test evaluation theory in the decision-theoretic
framework. The present chapter outlines a somewhat simpler, more
straightforward approach to what Cronbach and Gleser call "selection
decisions with single-stage testing," which, unlike their approach, does
not rely on correlation coefficients. It is restricted to situations in
which the criterion is dichotomous (or dichotomized) and the test score is
continuous.

Statistical decision theory specifies the optimum decision in a situation
where one must choose between two alternative statistical hypotheses on the
basis of an observed event. In particular, it specifies the optimum cutoff,
along the continuum on which the observed events are arranged, as a
function of (a) the a priori probabilities of the two hypotheses, (b) the
values and costs associated with the various decision outcomes, and (c) the
amount of overlap of the distributions that correspond to the hypotheses.
See especially Chernoff and Moses (1959), Good (1962), Marschak (1954), and
Swets et al. (1961).
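
A minimal sketch of how factors (a), (b), and (c) combine is given below. It uses the standard likelihood-ratio criterion of the signal detection literature, which is not necessarily the formulation developed in this study, and it assumes normal, equal-variance score distributions purely for illustration (an assumption the selection test model itself does not require). All names and numbers are hypothetical.

```python
import math


def likelihood_ratio_criterion(p_success, u_correct_accept, u_erroneous_reject,
                               u_correct_reject, u_erroneous_accept):
    """Accept when f_S(x) / f_F(x) is at least this value (expected-utility rule)."""
    p_fail = 1.0 - p_success
    stake_if_fail = u_correct_reject - u_erroneous_accept
    stake_if_success = u_correct_accept - u_erroneous_reject
    return (p_fail * stake_if_fail) / (p_success * stake_if_success)


def cutoff_equal_variance_normal(mean_fail, mean_success, sigma, beta):
    """Score at which the likelihood ratio equals beta, assuming two normal
    distributions with a common standard deviation and mean_success > mean_fail."""
    midpoint = (mean_fail + mean_success) / 2.0
    return midpoint + sigma ** 2 * math.log(beta) / (mean_success - mean_fail)


# Equal a priori probabilities and symmetric payoffs: cutoff at the midpoint.
beta_symmetric = likelihood_ratio_criterion(0.5, 1, -1, 1, -1)
cutoff_symmetric = cutoff_equal_variance_normal(40, 60, 10, beta_symmetric)

# Costlier erroneous acceptances push the cutoff upward.
beta_strict = likelihood_ratio_criterion(0.5, 1, -1, 1, -3)
cutoff_strict = cutoff_equal_variance_normal(40, 60, 10, beta_strict)

print(f"symmetric payoffs: cutoff = {cutoff_symmetric:.2f}")
print(f"costly erroneous acceptance: cutoff = {cutoff_strict:.2f}")
```

Changing the a priori probability, the payoffs, or the separation of the two distributions each moves the cutoff, which corresponds to factors (a), (b), and (c) respectively.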

In applied psychology, selection tests are most often used to make a simple
yes-no decision in terms of such things as hiring, promotion, training,
etc. A particular dichotomous decision represents predictions (or
hypotheses) based on a test score. In Figure 3 test score is labeled x and
plotted on the abscissa. The left-hand distribution, labeled $f_F(x)$, is
the probability density function of x given a person who would "fail." The
right-hand distribution is the probability density function of x given a
person who would "succeed." (Probability density functions are used, rather
than probability functions, since x is assumed to be continuous.) Although
the distributions appear to be normal and equally variant, the selection
test model presented below assumes neither.


[Figure 3. Ordinate: Probability Density. Abscissa: Test Score (x).]

Fig. 3.--The probability density functions of Fail and Succeed.


The basic decision is whether a given test score arises from
one distribution or the other, or, equivalently, the relative
probabilities that a person obtaining that score will succeed or fail. It is
desirable to establish a standard, a cutoff $x_c$ on the continuum of test
scores, to which any given score $x_i$ can be related. If it is found for
the i-th test score, $x_i$, that $x_i > x_c$, the decision is to "accept"; if
$x_i < x_c$, the decision is to "reject."

In the language of statistical decision theory a subset of all
the scores, namely a Critical Region A (accept), is chosen such that a
test score in this subset leads to acceptance of the Hypothesis S, to
the prediction that the person will succeed. All other scores are in
the complementary subset R (reject); these lead to rejection of the
Hypothesis S, or, equivalently, to the acceptance of the Hypothesis F,
to predict the person will fail. The Critical Region A, with reference
to Figure 3, consists of the values of x to the right of some cutoff $x_c$.

The decision outcome may be a correct acceptance (A,S--the
joint occurrence of a score in Region A and success), a correct
rejection (R,F), an erroneous rejection (R,S), or an erroneous acceptance
(A,F). If the a priori probability of a success and the parameters
of the distributions of Figure 3 are fixed, the choice of a cutoff
value $x_c$ completely determines the probability of each of these
outcomes.

Clearly,  the  four  probabilities  are  interdependent.  For  ex¬ 
ample,  an  increase  in  the  probability  of  a  correct  acceptance,  P(A,S), 
can  be  achieved  only  by  accepting  an  increase  in  the  probability  of  an 
erroneous  acceptance,  P(A,F),  and  decreases  in  the  other  probabilities, 
P(R,S)  and  P(R,F).  Thus,  a  given  cutoff  yields  a  particular  balance 
among  the  probabilities  of  the  four  possible  outcomes;  conversely,  the 
balance  desired  in  any  instance  will  determine  the  optimum  location  of 
the  cutoff.  Now  one  may  desire  the  balance  that  maximizes  the  expected 
value  of  decisions  where  the  four  possible  outcomes  have  individual 
utilities. One may, however, desire a balance that maximizes some
other quantity--i.e., a balance that is optimum according to some
other definition of optimum--in which case a different cutoff will be
appropriate. One may, for example, want to maximize P(A,S) while
satisfying a restriction on P(A,F), as one typically does when as
an experimenter one assumes an .05 or .01 level of confidence.
Alternately, one may want to maximize the number of correct decisions.
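
The interdependence of the four outcome probabilities can be checked numerically. The following is a minimal sketch, not part of the original study: it assumes normal densities for the "succeed" and "fail" groups purely for illustration (the model itself does not require normality), and the function names and parameter values are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def outcome_probabilities(cutoff, p_success, mean_s, sd_s, mean_f, sd_f):
    """Joint probabilities of the four decision outcomes for a given cutoff.

    Scores above the cutoff fall in Region A (accept); the normal densities
    are an illustrative assumption only.
    """
    p_fail = 1.0 - p_success
    accept_given_s = 1.0 - normal_cdf(cutoff, mean_s, sd_s)
    accept_given_f = 1.0 - normal_cdf(cutoff, mean_f, sd_f)
    return {
        "A,S": p_success * accept_given_s,
        "R,S": p_success * (1.0 - accept_given_s),
        "A,F": p_fail * accept_given_f,
        "R,F": p_fail * (1.0 - accept_given_f),
    }

# Hypothetical values: a priori probability .8, group means 60 and 40, SD 10.
low = outcome_probabilities(45.0, 0.8, 60.0, 10.0, 40.0, 10.0)
high = outcome_probabilities(55.0, 0.8, 60.0, 10.0, 40.0, 10.0)
```

Lowering the cutoff from 55 to 45 raises both P(A,S) and P(A,F) and lowers P(R,S) and P(R,F), which is the trade-off described in the text.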

The  manner  of  specifying  the  optimum  cutoff  will  be  illustrated 
for  just  one  of  these  definitions  of  optimum,  namely,  the  maximization 
of  the  total  expected  value  (or  utility)  of  a  decision  in  a  situation 
where  the  four  possible  outcomes  of  a  decision  have  individual  utilities 
associated  with  them.  The  expected  utility  (EU)  of  a  strategy  is 
defined  in  statistical  decision  theory  as  the  sum,  over  the  potential 
outcomes  of  a  decision,  of  the  products  of  probability  of  outcome  and 
the  desirability  (utility)  of  outcome: 

$$EU = P(A,S)U_{A,S} + P(A,F)U_{A,F} + P(R,F)U_{R,F} + P(R,S)U_{R,S}.$$

In this equation $U_{A,S}$, $U_{A,F}$, $U_{R,F}$, $U_{R,S}$ are the utilities of a
correct acceptance, an erroneous acceptance, a correct rejection, and
an erroneous rejection, respectively. For any observed value, $x_i$, the
expected utility of the decision to accept is:

$$EU_A = P(S|x_i)U_{A,S} + P(F|x_i)U_{A,F},$$

where $P(S|x_i)$ is the probability of a "success" conditional upon, or
given, $x_i$; $P(F|x_i)$ is the probability of a "fail," given $x_i$. Similarly,
the expected utility of the decision to reject is given by

$$EU_R = P(F|x_i)U_{R,F} + P(S|x_i)U_{R,S}.$$
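
The two conditional expected utilities can be computed directly. This is a small sketch under stated assumptions: the function name is hypothetical, and the utility values (10, -6, 12, -8) are borrowed from the hypothetical payoff matrix used later in the text, only to give the arithmetic something concrete to work on.

```python
def expected_utilities(p_success_given_x, u_as, u_af, u_rf, u_rs):
    """Return (EU_A, EU_R) for one observed score.

    p_success_given_x is P(S|x_i); P(F|x_i) is its complement because the
    criterion is dichotomous.
    """
    p_fail_given_x = 1.0 - p_success_given_x
    eu_accept = p_success_given_x * u_as + p_fail_given_x * u_af
    eu_reject = p_fail_given_x * u_rf + p_success_given_x * u_rs
    return eu_accept, eu_reject

# Illustrative case: P(S|x_i) = .7 with the hypothetical utilities above.
eu_a, eu_r = expected_utilities(0.7, u_as=10, u_af=-6, u_rf=12, u_rs=-8)
```

With these numbers the expected utility of accepting is 5.2 and that of rejecting is -2.0, so the score would be accepted.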


In  statistical  decision  theory  the  optimum  cutoff  is  specified 
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Multiplying  both  sides  of  Equation  (3)  by  P(F)/P(S)  yields 


$$\frac{P(F)P(S|x_i)}{P(S)P(F|x_i)} = \frac{P(F)(U_{R,F} - U_{A,F})}{P(S)(U_{A,S} - U_{R,S})}. \tag{5}$$


The  left-hand  term  of  Equation  (5)  is  the  likelihood  ratio  given  in 
Equation  (4).  Thus,  the  likelihood  ratio  at  the  optimum  cutoff  has 
been  shown  to  be  equal  to  the  right-hand  term  in  Equation  (5).  That 
is,  it  is  the  point  on  the  x  continuum  where  Equation  (5)  is  true. 

The optimum cutoff can be specified by some value B of $\lambda(x)$.
This value can now be given as

$$B = \frac{P(F)(U_{R,F} - U_{A,F})}{P(S)(U_{A,S} - U_{R,S})}, \tag{6}$$


since, when $\lambda(x) > B$, $EU_A > EU_R$, and when $\lambda(x) < B$, $EU_A < EU_R$. This
can be seen by noting that when $\lambda(x) > B$ this inequality will also be
true of Equations (5), (3), (2), and (1); consequently $EU_A > EU_R$.
Similarly, when $\lambda(x) < B$ this inequality will be true of the same
equations, and $EU_A < EU_R$. The decision should therefore be to "accept"
whenever $\lambda(x) > B$ and to "reject" whenever $\lambda(x) < B$. The former will
be true only when $x > x_c$ and the latter only when $x < x_c$, provided
that $\lambda(x)$ is monotonic increasing with x, $U_{A,S} > U_{R,S}$, and $U_{R,F} > U_{A,F}$.
Thus the Critical Region A lies to the right of $x_c$ and the Critical
Region R lies to the left of $x_c$.
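
The threshold B of Equation (6) and the score at which the likelihood ratio reaches it can be sketched as follows. This is an illustration only: the closed-form cutoff assumes equal-variance normal densities, which the model itself does not require, and the function names and numerical values are hypothetical.

```python
import math

def likelihood_ratio_threshold(p_success, u_as, u_af, u_rf, u_rs):
    """Threshold B of Equation (6): P(F)(U_RF - U_AF) / (P(S)(U_AS - U_RS))."""
    p_fail = 1.0 - p_success
    return (p_fail * (u_rf - u_af)) / (p_success * (u_as - u_rs))

def cutoff_equal_variance_normal(threshold, mean_s, mean_f, sd):
    """Score x_c at which the likelihood ratio f_S(x)/f_F(x) equals threshold.

    Assumes (for illustration) normal densities with a common SD, for which
    the ratio is exp((mean_s - mean_f) * (x - midpoint) / sd**2) and is
    therefore monotonic increasing in x when mean_s > mean_f.
    """
    midpoint = (mean_s + mean_f) / 2.0
    return midpoint + sd ** 2 * math.log(threshold) / (mean_s - mean_f)

# Hypothetical inputs: P(S) = .8 and utilities 10, -6, 12, -8.
b = likelihood_ratio_threshold(0.8, u_as=10, u_af=-6, u_rf=12, u_rs=-8)
x_c = cutoff_equal_variance_normal(b, mean_s=60.0, mean_f=40.0, sd=10.0)
```

Because B is below 1 in this example, the cutoff falls below the midpoint of the two group means: a high a priori probability of success makes acceptance worthwhile even at somewhat lower scores.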

This constitutes the model of test-based selection decisions
from the standpoint of statistical decision theory. The cutoff (and
therefore, the selection ratio), a priori probability, and institutional
gains and losses are central factors. An evaluation index based on this
model is presented in the next chapter. Chapters V and VI deal with
utilities and ways to measure them.

CHAPTER IV

TWO  NEW  METHODS  FOR  EVALUATING  SELECTION  TESTS 

In this chapter an index for evaluating selection tests, based
on the model presented in the previous chapter, is developed.
It will be seen that no index of association is needed because the
evaluation index represents a direct measure of the improvement over
chance prediction provided by the test. The final section of this
chapter presents an index based on the method developed by Brogden
(1949) which also purports to indicate the utility of selection tests.

Decision-Theoretic  Method 

The starting point of this method is a payoff matrix. When a
cutoff on the test is used and the outcomes to be predicted form a
dichotomy, the payoff matrix is as shown in Figure 4, where $U_1$, $U_2$,
$U_3$, and $U_4$ are utilities which correspond to erroneous rejection,
correct acceptance, correct rejection, and erroneous acceptance,
respectively. (See Chapter VI for a thorough explanation of payoff
matrices.)


                           Decision
                       Reject    Accept

Criterion   Succeed     U_1       U_2
(Job A)
            Fail        U_3       U_4

Fig. 4.--The standard payoff matrix.


Assume that 100 persons, selected at random, have been assigned
to Job A. The utility equation for an obtained table is

$$U = n_1U_1 + n_2U_2 + n_3U_3 + n_4U_4, \tag{1}$$

where the n's are the frequencies in the corresponding cells of the
contingency table shown in Figure 5.

                             Test
                        Low      High

Criterion   Succeed     n_1      n_2      p
(Job A)
            Fail        n_3      n_4      1 - p

                       1 - q      q

Fig. 5.--The standard 2 x 2 contingency table.

To estimate the utility of a test to the decision-making process, U
must be compared with the one that would result with a test of zero
utility, i.e., one providing only chance prediction, $U_c$. When the
observations in the contingency table are randomly distributed, each
cell frequency is the product of the corresponding marginal
probabilities and N ($N = n_1 + n_2 + n_3 + n_4$). Therefore

$$U_c = (p - pq)NU_1 + pqNU_2 + (1 + pq - p - q)NU_3 + (q - pq)NU_4, \tag{2}$$

where p is the a priori probability and q is the selection ratio as
shown in Figure 5. Then, the utility of the test is given by the
difference between U and $U_c$:

$$U_T = U - U_c. \tag{3}$$


This procedure can be simplified and made to fit the usual test
evaluation situation where $n_1$ and $n_3$ are not known. It can be shown
(see Appendix A) that $U_T$ is independent of the addition of any
constant (positive or negative) to the values of both entries in a row
of the payoff matrix. Since only the individuals above the cutoff
on the test, the accepted group, are available to the test evaluator,
the most useful payoff matrix is the one shown in Figure 6.


                           Decision
                       Reject     Accept

Criterion   Succeed      0      U_2 - U_1
(Job A)
            Fail         0      U_4 - U_3

Fig. 6.--The modified payoff matrix obtained by subtracting $U_1$
from the first row and $U_3$ from the second row.


Then,

$$U = n_2(U_2 - U_1) + n_4(U_4 - U_3) \tag{4}$$

and

$$U_c = pqN(U_2 - U_1) + (q - pq)N(U_4 - U_3). \tag{5}$$

Since N and q are unknowns, substitute for q its equivalent,
$(n_2 + n_4)/N$:

$$U_c = p\left(\frac{n_2 + n_4}{N}\right)N(U_2 - U_1) + (1 - p)\left(\frac{n_2 + n_4}{N}\right)N(U_4 - U_3). \tag{6}$$

Now N cancels and $U_c$ becomes

$$U_c = p(n_2 + n_4)(U_2 - U_1) + (1 - p)(n_2 + n_4)(U_4 - U_3). \tag{7}$$

Again, the difference between U and $U_c$ equals the gain in utility due
to the test:

$$U_T = U - U_c.$$

This $U_T$ is equal to the one obtained prior to changing the
payoff matrix. Appendix A presents the mathematical proof.

Example: Assume that 100 men were assigned to electronic
training and that, after training, the graduates and fails are distributed
as in Table 3.

TABLE 3

A HYPOTHETICAL CONTINGENCY TABLE

                             Test
                        Low      High

Criterion   Graduate     20       60

            Fail         15        5

Assume further that the consequences of the four decision-outcome
combinations have been considered (see Chapter VI) and the payoff
matrix shown in Table 4 has been determined.

TABLE 4

A HYPOTHETICAL PAYOFF MATRIX

                           Decision
                       Reject    Accept

Criterion   Graduate     -8        10

            Fail         12        -6

The U equation for this example is

$$U = 20(-8) + 60(10) + 15(12) + 5(-6) = -160 + 600 + 180 - 30 = 590.$$

The U equation for chance prediction is

$$U_c = (.8 - .52)100(-8) + (.52)100(10) + (1 + .52 - .8 - .65)100(12) + (.65 - .52)100(-6)$$
$$= 28(-8) + 52(10) + 7(12) + 13(-6) = 302.$$

The utility of the test is

$$U_T = 590 - 302 = 288.$$
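
The worked example can be reproduced with a few lines of code. This is a minimal sketch, not the author's procedure as programmed: the function name and the tuple ordering (cells 1 to 4 as in Figures 4 and 5) are choices made here for illustration.

```python
def utility_of_test(n, u):
    """Return (U, U_c, U_T) from Equations (1) to (3).

    n = (n1, n2, n3, n4) are the contingency-table frequencies and
    u = (U1, U2, U3, U4) the payoff-matrix entries, both in the cell order
    succeed/low, succeed/high, fail/low, fail/high.
    """
    total = sum(n)
    p = (n[0] + n[1]) / total  # a priori probability of success
    q = (n[1] + n[3]) / total  # selection ratio
    # Chance cell proportions are products of the marginal probabilities.
    chance = (p * (1 - q), p * q, (1 - p) * (1 - q), (1 - p) * q)
    observed_u = sum(ni * ui for ni, ui in zip(n, u))
    chance_u = sum(c * total * ui for c, ui in zip(chance, u))
    return observed_u, chance_u, observed_u - chance_u

# Table 3 frequencies and Table 4 payoffs.
u_obs, u_chance, u_test = utility_of_test((20, 60, 15, 5), (-8, 10, 12, -6))
```

The three returned values correspond to the 590, 302, and 288 obtained above (up to floating-point rounding).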

If the payoff matrix is simplified as shown above it becomes the
one presented in Table 5; then,

$$U = 60(18) + 5(-18) = 990$$

and

$$U_c = .8(60 + 5)(18) + .2(60 + 5)(-18) = 52(18) + 13(-18) = 702.$$

The utility is the same as before:

$$U_T = 990 - 702 = 288.$$
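
The simplified computation, which needs only the accepted group and the a priori probability, can be sketched in the same way. The function name is hypothetical; the formulas are Equations (4) and (7).

```python
def utility_from_accepted_group(n2, n4, p, u1, u2, u3, u4):
    """Return (U, U_c, U_T) using only the accepted group.

    n2 and n4 are the accepted successes and accepted failures; p is the
    a priori probability of success, assumed known from earlier research.
    """
    gain_success = u2 - u1  # modified payoff for an accepted success
    gain_fail = u4 - u3     # modified payoff for an accepted failure
    observed = n2 * gain_success + n4 * gain_fail
    chance = (n2 + n4) * (p * gain_success + (1 - p) * gain_fail)
    return observed, chance, observed - chance

# Accepted group of Table 3, p = .8, payoffs of Table 4.
u_obs, u_chance, u_test = utility_from_accepted_group(60, 5, 0.8, -8, 10, 12, -6)
```

Although U and $U_c$ change when constants are subtracted from the rows of the payoff matrix, their difference is again 288.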

TABLE 5

A MODIFIED VERSION OF A HYPOTHETICAL PAYOFF MATRIX

                           Decision
                       Reject    Accept

Criterion   Graduate      0        18

            Fail          0       -18

An assumption explicit in this method is that the cutoff has
been set at the best possible point on the test. If an inflexible
quota must be filled this assumption is of no consequence. However,
many times it is of value to determine the best possible cutoff, i.e.,
one that balances the positive and negative utilities of correct and
erroneous decisions. This point can be easily determined if a payoff
matrix is available. It has been shown in Chapter III to be the point
on the test where

$$\lambda(x) = \frac{P(F)(U_{R,F} - U_{A,F})}{P(S)(U_{A,S} - U_{R,S})}, \tag{8}$$


where $\lambda(x)$ is the likelihood ratio $f_S(x)/f_F(x)$. In the symbolism of
contingency tables and payoff matrices, the right-hand term of Equation
(8) is

$$\frac{(1 - p)(U_3 - U_4)}{p(U_2 - U_1)}.$$

The test will be of greatest utility if the cutoff is set at the point
where

$$\lambda(x) = \frac{(1 - p)(U_3 - U_4)}{p(U_2 - U_1)}, \tag{9}$$

or, according to Equation (4) in Chapter III, where

$$\frac{(1 - p)P(S|x_i)}{pP(F|x_i)} = \frac{(1 - p)(U_3 - U_4)}{p(U_2 - U_1)}, \tag{10}$$

which can be reduced to

$$\frac{P(S|x_i)}{P(F|x_i)} = \frac{U_3 - U_4}{U_2 - U_1}. \tag{11}$$
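
Equation (11) states the optimum cutoff as a critical value of the odds of success. The sketch below converts those odds into the minimum probability of success at which acceptance is preferred; the function names are hypothetical, and treating exact equality as "reject" is an arbitrary choice made here, since the two decisions have equal expected utility at that point.

```python
def critical_odds(u1, u2, u3, u4):
    """Right-hand side of Equation (11): (U3 - U4) / (U2 - U1)."""
    return (u3 - u4) / (u2 - u1)

def critical_probability(u1, u2, u3, u4):
    """P(S|x) at which the odds P(S|x)/P(F|x) equal the critical odds."""
    odds = critical_odds(u1, u2, u3, u4)
    return odds / (1.0 + odds)

def decide(p_success_given_x, u1, u2, u3, u4):
    """Accept when the odds of success exceed the critical odds."""
    if p_success_given_x > critical_probability(u1, u2, u3, u4):
        return "accept"
    return "reject"

# With the payoffs of Table 4 the critical odds are 18/18 = 1.
threshold = critical_probability(-8, 10, 12, -6)
```

For the hypothetical payoff matrix of Table 4 the cutoff therefore falls at the score where success and failure are equally likely.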

Utility Function Method


This method is essentially the one developed by Brogden (1949).
He was concerned, however, with the case in which the test is
dichotomous and the criterion is continuous. The method is described here
for the case in which both variables are in dichotomous form.

The criterion is translated into utility terms and the "gain"
per man selected is computed. Consider Figure 5 in the preceding
section where the observations are a random sample of size N
($N = n_1 + n_2 + n_3 + n_4$). All N persons have been assigned to Job A.
Test scores have been obtained for all N persons prior to their
assignment to Job A. Criterion scores, succeed and fail, have been
assigned on the basis of performance in Job A and translated into
utility terms. An individual's criterion score is his utility in Job A.
These utility values are hereafter labeled $U_S$ and $U_F$.


From the above definitions the following statistics can be
determined:

$$M_U = \frac{(n_1 + n_2)U_S + (n_3 + n_4)U_F}{N}. \tag{12}$$

Equation (12) can be interpreted as "the mean utility for a random
sample of individuals assigned to Job A."

$$\frac{n_2U_S + n_4U_F}{n_2 + n_4}. \tag{13}$$

Equation (13) can be interpreted as "the average utility for the
subgroup of a random sample of individuals who are high on the test when
they are assigned to Job A."

$$G_U = \frac{n_2U_S + n_4U_F}{n_2 + n_4} - M_U. \tag{14}$$

The value "$G_U$" defined in Equation (14) is the gain in utility which
would be realized, on the average, by assigning individuals to Job A
on the basis of the test, rather than at random.

Example: Assume that 100 Navy recruits were assigned at random
to electronic training and that, after training, the graduates and fails
(non-graduates) are distributed as in Table 3 in the previous section.
Assume further that a graduate is worth 100 utiles (the unit of
measurement on the utility scale) to the Navy, and a fail is worth 40 utiles
to the Navy. The total utility of these men to the Navy is easily
determined. There are 80 graduates, worth 100 utiles each, or 8,000
utiles altogether. There are 20 fails worth 40 utiles each, or 800
utiles altogether. Thus, the total utility for the group is 8,800
utiles. The average utility for the men assigned to electronics training
is

$$M_U = \frac{8800}{100} = 88.$$

For men high on the test, the average utility is similarly determined
to be

$$\frac{60(100) + 5(40)}{65} = \frac{6200}{65} = 95.38.$$

Then the gain per man is

$$G_U = 95.38 - 88 = 7.38.$$

The conclusion would be that, provided the manpower pool is large enough,
the Navy will be 7.38 utiles ahead, on the average, for each man assigned
using the test. This figure should of course be reduced by the cost of
testing. This cost will be ignored here because it is negligible per
man in the Navy setting. (Testing takes one day out of a recruit's
schedule, and four men administer a test battery to 500 recruits per
day.)
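
The arithmetic of this example follows Equations (12) to (14) and can be checked with a short sketch. The function name is hypothetical; the cell order matches Figure 5.

```python
def gain_per_selectee(n, u_s, u_f):
    """Return (mean utility of all, mean utility of the high group, gain).

    n = (n1, n2, n3, n4) in the cell order succeed/low, succeed/high,
    fail/low, fail/high; u_s and u_f are the utilities of a success and
    of a fail.
    """
    n1, n2, n3, n4 = n
    total = n1 + n2 + n3 + n4
    mean_all = ((n1 + n2) * u_s + (n3 + n4) * u_f) / total  # Equation (12)
    mean_high = (n2 * u_s + n4 * u_f) / (n2 + n4)           # Equation (13)
    return mean_all, mean_high, mean_high - mean_all        # Equation (14)

# Table 3 frequencies with a graduate worth 100 utiles and a fail worth 40.
m_all, m_high, gain = gain_per_selectee((20, 60, 15, 5), 100, 40)
```

The results agree with the values of 88, 95.38, and 7.38 utiles given in the text.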

Since N, $n_1$, and $n_3$ are not known when the test to be evaluated
has been operational for some time, it will help to express the equation
for $M_U$ in terms of the a priori probability, p, estimated from previous
research. An equivalent equation is

$$M_U = pU_S + (1 - p)U_F. \tag{15}$$

Unlike the method presented in the previous section, this
method does not consider the cost of rejecting a person who would
have succeeded or the value of correctly rejecting a fail. $G_U$ only
reflects the gain per selectee over chance prediction. Therefore, it
is to be expected that $G_U$ will provide a lower estimate of the utility
of selection tests than will $U_T$. Data bearing on this point will be
found in Chapter VII.

$G_U$ and $U_T$ can be compared directly (mathematically) by making
assumptions regarding the relative size of the utilities in the two
methods. The most obvious is that $U_S = U_2 - U_1$ and $U_F = U_4 - U_3$.
When this is true $U_T = (n_2 + n_4)G_U$. However, these assumptions are
very restrictive and will be true only rarely. They are not true in
the empirical situation under study. Since $U_S = U_2 - U_1$, $U_S$ can equal
$U_2$ only when $U_1$ is zero. Also, since one assumption underlying
the decision-theoretic approach is that $U_3 > U_4$ (see Chapter III), $U_F$
can equal $U_4 - U_3$ only when $U_F$ is negative, which is not true in the
empirical situation under investigation.
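
The relation between the two indices under these assumptions can be verified numerically. The sketch below is illustrative only: it reuses the hypothetical frequencies of Table 3 and payoffs of Table 4, and the function names are not from the original.

```python
def decision_theoretic_utility(n, u):
    """U_T from the accepted group, Equations (4) and (7) of this chapter."""
    n1, n2, n3, n4 = n
    u1, u2, u3, u4 = u
    p = (n1 + n2) / (n1 + n2 + n3 + n4)
    observed = n2 * (u2 - u1) + n4 * (u4 - u3)
    chance = (n2 + n4) * (p * (u2 - u1) + (1 - p) * (u4 - u3))
    return observed - chance

def gain_per_selectee(n, u_s, u_f):
    """G_U: mean utility of the high group minus mean utility of all."""
    n1, n2, n3, n4 = n
    mean_all = ((n1 + n2) * u_s + (n3 + n4) * u_f) / (n1 + n2 + n3 + n4)
    mean_high = (n2 * u_s + n4 * u_f) / (n2 + n4)
    return mean_high - mean_all

cells = (20, 60, 15, 5)
payoffs = (-8, 10, 12, -6)
u_t = decision_theoretic_utility(cells, payoffs)
# Impose the assumptions U_S = U2 - U1 and U_F = U4 - U3.
g_u = gain_per_selectee(cells, payoffs[1] - payoffs[0], payoffs[3] - payoffs[2])
accepted = cells[1] + cells[3]
```

Here `u_t` is 288 and `accepted * g_u` gives the same value, so the total gain equals the gain per selectee multiplied by the number accepted. Note that the implied $U_F$ is -18, i.e., negative, as the text requires.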

Another  method  that  might  appeal  to  some  readers  is  to  weight 
the  cell  frequencies  by  the  corresponding  utilities  and  compute  the 
phi  coefficient.  This  method  would  have  the  following  drawbacks: 

(l)  n^  and  are  often  not  known,  (2)  determining  and  would 
probably  be  more  difficult  than  determining  (see  the  following 

chapter), and (3) the resultant coefficient would seem (to the writer)
to be very difficult to interpret.

CHAPTER V


MEASUREMENT  OF  VALUES  INHERENT  IN  TEST  EVALUATION 

Both of the approaches presented in the preceding chapter--one
using a payoff matrix and the other a converted criterion scale--require
quantitative measurement of value. The relative values of the four
decision-outcome combinations must be determined in the first approach.
The value of a satisfactory assignee relative to an unsatisfactory one
must be determined in the second approach. In both cases, of central
importance is the value of obtaining a satisfactory person for the
assignment--$U_2$ or $U_S$. It can be thought of as the need for a
satisfactory assignee. This value can be made more meaningful to the
institution using the test by scaling the job areas on need--the need for a
satisfactory assignee. The relative need for additional satisfactory
persons in the job areas can in this way be determined and expressed
quantitatively. Possible ways of scaling the job areas on need are
described. How the criterion can be converted once this scale is
obtained is shown.

The specific situation in terms of which these methods were
explored is that of recruit classification in the U. S. Navy.
Selection tests are used on which cutoffs are established. If a recruit
receives a score above the cutoff he may if he wishes go to the school
for which the test is a selector (subject to quota restrictions); if
he receives a score below the cutoff he will not be sent to that school.
The criterion against which selection tests are currently validated is
school grade. The methods described below are presented in terms of
the dichotomous criterion, graduate-fail, which is based on school
grade. The continuum on which job areas were scaled is therefore the
utility of school graduates to the operational Navy, or, the need for
school graduates in the corresponding job areas.

It might be worth mentioning at this point that a side benefit
of this scaling process is that the scale values are vitally needed for
optimal classification of recruits to schools and hence to job areas.
Optimal classification is not possible without a measure of need across
job areas. The same is true regarding job applicants in other applied
situations.

Scaling  Job  Areas  on  Need 

Two methods were used: one "indirect" method (probability
comparison), which involves inferring values from choices made by
judges, and one "direct" method (magnitude estimation), which requires
each judge to estimate need in each job category. The methods are
designed for different types of judges, namely, classification
interviewers and area personnel planners. The indirect method was developed
by the author of this study. He knows of no similar method in the
scaling literature.

Probability Comparison

Eleven  classification  interviewers  were  asked  to  indicate  how 
they  would  classify  imaginary  recruits  with  certain  probabilities  of 
success  in  Navy  schools.  A  questionnaire  (see  Appendix  C)  was  con¬ 
structed  containing  items  like  the  following  for  each  pair  of  schools: 

To which school would you assign a recruit if you think his
chances of success are

           School A             School B

    (a)    80% ___      and     60% ___
    (b)    80% ___      and     70% ___
    (c)    80% ___      and     80% ___
    (d)    80% ___      and     90% ___
    (e)    80% ___      and     95% ___

In this way each respondent's indifference point for each pair of
schools was determined. This is a point in the probability space
where the respondent is indifferent as to the assignment of recruits
to one school or the other. Its coordinates are assumed to be the
midpoints of the intervals where the respondent's marks change columns.
(The items should be so constructed as to ensure that a crossover always
occurs.)

Establishment of the indifference point for a pair of schools
leads to the following equation:

$$pG_A + (1 - p)F_A = qG_B + (1 - q)F_B.$$

Here p is the probability that the recruit would graduate in School
A, q is the graduation probability in School B that leads to indifference,
$G_A$ and $G_B$ are the subjective values of the recruit graduating in Schools
A and B, respectively, and $F_A$ and $F_B$ are the values of the recruit
failing in Schools A and B, respectively.

One constraint must be placed on the above equation in order to
solve for $G_A$ and $G_B$. It was assumed that $F_A = F_B$, i.e., that in making
his choices the respondent considered the two events, failing in School
A and failing in School B, equally bad. One point on the scale was
established by making the following restriction:

$$F_A = F_B = 0.$$

The other arbitrary point on the scale was set by choosing a value for
either $G_A$ or $G_B$. If $G_A$ is set equal to 100 the equation becomes

$$p(100) = qG_B.$$

If the point of indifference chosen by the respondent is defined by
the probabilities .8 and .6, $G_B$ can be determined thus:

$$.8(100) = .6G_B,$$

$$G_B = 80/.6 = 133.$$
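
With the value of a failure fixed at zero in both schools, the scale value of the second school follows directly from the indifference equation. A minimal sketch, with a hypothetical function name:

```python
def scale_value_second_school(p_first, q_indifference, value_first=100.0):
    """Solve p * G_A = q * G_B for G_B, taking F_A = F_B = 0.

    p_first is the chance of success in the first school, q_indifference
    the chance of success in the second school at the indifference point,
    and value_first the (arbitrarily set) value of a first-school graduate.
    """
    return p_first * value_first / q_indifference

# Indifference at .8 versus .6, with G_A set to 100.
g_b = scale_value_second_school(0.8, 0.6)
```

The result, about 133, matches the value obtained above: a respondent who accepts a lower chance of success in School B must value a School B graduate more highly.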

With 10 schools there are 45 possible pairs but only nine are
necessary to scale them. Additional ones were presented in order to
obtain stable scale values. The scale values of all can be computed
once a single one is arbitrarily set.

The scaling questionnaire is presented in Appendix C. The
percentage value for the school presented on the left in each question
was set by the writer at the level that he thought would seem
reasonable to the respondents. The percentage values for the school on the
right were chosen to constitute an adequate range of choices around the
percentage value for the school on the left to ensure that a
crossover would occur. As nearly as possible, the schools appear on the
left an equal number of times.

There are 24 questions in the questionnaire, each with a unique
pair of schools. Pairs were chosen in the following way: the schools
were subjectively ranked by the writer in terms of the need for
additional men in the job areas corresponding to the schools; the nine
adjacent pairs were used; a wide variety of more divergent pairs were
chosen in such a way that the schools appear roughly the same number
of times.

Eleven  indifference  points  were  obtained  for  each  question--one 
from  each  respondent.  In  the  rare  instance  where  all  the  marks  were  in 
one  column,  the  indifference  point  was  assumed  to  be  the  point  repre¬ 
sented  at  the  low  end  of  the  second  column. 
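
The rule for locating an indifference point can be sketched as follows. This is one reading of the procedure, not the author's own computation: the function name and the data layout are hypothetical, and the fallback for the no-crossover case interprets "the low end of the second column" as the lowest chance of success offered for the second school.

```python
def indifference_point(offered, chose_second):
    """Locate the indifference point for one question.

    offered: the chances of success (in percent) offered for the second
    school, in ascending order. chose_second: for each item, True if the
    respondent marked the second school. The point is taken as the midpoint
    of the interval in which the marks change columns.
    """
    pairs = zip(offered, offered[1:], chose_second, chose_second[1:])
    for lower, upper, previous, current in pairs:
        if current != previous:
            return (lower + upper) / 2.0
    # No crossover: all marks fell in one column (assumed fallback).
    return float(offered[0])

# A respondent who switches to School B between the 80% and 90% items.
point = indifference_point([60, 70, 80, 90, 95], [False, False, False, True, True])
```

For this hypothetical respondent the indifference point is 85.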

The data are presented in Table 6. The mean indifference point
for each question in the questionnaire is presented as is the average
deviation. (Standard deviations were not used because the distributions
are somewhat truncated at the upper end and a few extreme scores occur
at the other end.) The indifference points in column (4) are not
directly comparable since they represent responses made relative to
various dissimilar chances of success: those presented in column (3).
The indifference points were made comparable by dividing the chance of
success in the school presented on the left in each item by the mean
indifference point for that item. These ratios are presented in
column (6) of Table 6. The product of a ratio and the scale value of
the first school yields the scale value of the second school. Using
these ratios and an arbitrarily assigned number for the value of a
graduate from one of the schools it is possible to compute scale values
for all of the schools. This was done several times using different
schools as the arbitrary base. These computations showed that graduates
of SO school are worth more to the Navy than graduates of any other of
the ten schools. For the computation of the final scale values the
value of 100 was assigned to SO to represent the upper end of the need
scale.

TABLE 6

MEAN INDIFFERENCE POINTS, AVERAGE DEVIATIONS, AND THE
RATIOS USED IN CALCULATING THE SCALE VALUES FOR
THE PROBABILITY COMPARISON SCALING METHOD

Question   School    Chance of       Mean           Average     Preference
           Pair      Success in      Indifference   Deviation   Ratio*
                     First School    Point
  (1)       (2)         (3)            (4)            (5)         (6)

   1       RM--EN        70             87            5.91         .805
   2       PC--YN        90             79            8.90        1.139
   3       RM--HM        70             85            4.18         .824
   4       DK--HM        80             70            6.82        1.114
   5       MM--SO        90             66            6.20        1.324
   6       SK--EN        90             85            3.54        1.059
   7       EN--MM        80             76            4.00        1.053
   8       PC--EN        90             81            7.09        1.111
   9       ET--RM        70             67            4.54        1.045
  10       SO--ET        70             76            6.64         .921
  11       HM--SK        80             85            2.00         .941
  12       EN--DK        80             84            1.73         .952
  13       SK--PC        90             91            3.27         .989
  14       DK--ET        80             66            5.18        1.212
  15       DK--YN        80             73            5.64        1.096
  16       ET--HM        70             85            5.73         .824
  17       MM--SK        90             90            4.64        1.000
  18       HM--SO        80             67            4.82        1.194
  19       YN--SK        80             84            4.45         .952
  20       SO--RM        70             71            6.91         .935
  21       HM--YN        80             86            4.64         .930
  22       RM--DK        70             91            3.73         .769
  23       YN--EN        80             83            5.82         .964
  24       PC--MM        90             77            8.18        1.169

*The entry in column (3) divided by the entry in column (4).

The  order  of  computation  was  as  follows:  Scale  values  were 
first  computed  for  schools  in  questions  in  which  SO  appeared  on  the 
left.  Then,  order  of  computation  was  dictated  by  the  order  in  which 
these  scale  values  were  obtained.  That  is,  as  a  scale  value  for  a 
school  was  obtained,  this  value  was  used  with  the  questions  in  which 
that  school  appeared  on  the  left.  This  procedure  was  followed  until 
all  the  questions  had  been  used. 
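
The order of computation amounts to propagating scale values outward from the base school. The sketch below is an interpretation, not the author's worksheet: the function name is hypothetical, only five of the 24 questions are included, and it assumes that the first value obtained for a school is the one reused for questions in which that school appears on the left.

```python
from collections import deque

def propagate_scale_values(questions, base_school, base_value=100.0):
    """Compute scale values from preference ratios.

    questions: (first_school, second_school, preference_ratio) triples.
    The scale value of the second school is the ratio times the scale
    value of the first school. Returns every value obtained per school.
    """
    values = {base_school: [base_value]}
    queue = deque([base_school])
    while queue:
        school = queue.popleft()
        anchor = values[school][0]  # assumption: first value obtained is reused
        for first, second, ratio in questions:
            if first == school:
                if second not in values:
                    values[second] = []
                    queue.append(second)
                values[second].append(ratio * anchor)
    return values

# A subset of the ratios in column (6) of Table 6.
subset = [
    ("SO", "ET", 0.921),
    ("SO", "RM", 0.935),
    ("ET", "RM", 1.045),
    ("ET", "HM", 0.824),
    ("HM", "SO", 1.194),
]
scale = propagate_scale_values(subset, "SO")
mean_values = {school: sum(v) / len(v) for school, v in scale.items()}
```

Each school is processed once, so every question is used exactly once. With this subset, ET receives 92.1, HM about 75.9, and SO a second value of about 90.6 in addition to its assigned 100.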

Table 7 contains the scale values obtained from the twenty-four
questions. In each case the scale value was computed by multiplying
the preference ratio--column (6) of Table 6--for a question by an
already computed (or assigned, for SO) scale value. Table 7 also
contains the mean scale values for the ten schools under study. These are
the relative utilities of school graduates as determined by the
probability comparison scaling method.

TABLE 7

SCALE VALUES AND RELATIVE UTILITIES OF SCHOOL GRADUATES
AS SCALE VALUES OBTAINED THROUGH THE PROBABILITY
COMPARISON SCALING METHOD

School

Scale Values from Individual Questions

(100.0)   90.6  110.5   98.5   96.2   92.1   91.8   83.5   82.6
  75.9    81.2   84.4   70.6   83.0   80.4   75.8   75.5   79.3
  75.6    68.0   71.4   67.2   83.5   70.6   71.4

Utilities (Mean Scale Values)

 100.4    97.4   92.0   83.0


Magnitude  Estimation 

In this method nine area personnel planners were asked to scale
the job areas on need in a direct manner. That is, they were asked to
assign numbers to the job areas in accordance with the need for additional
men in the job areas. The instructions stressed the fact that the
numbers should be chosen in relation to each other. For example, if the
need in one job area is just half the need in another job area the
number assigned to the former should be just half the number assigned to
the latter. The scaling questionnaire is presented in Appendix D.

Table  8  describes  the  data  and  summary  statistics.  The  utili¬ 
ties  in  Table  8  have  larger  average  deviations  than  those  obtained 
through  the  probability  comparison  method.  However,  they  also  have  a 
greater  range.  The  rank-order  correlation  of  the  means  is  .84. 

TABLE 8

NEED RATINGS, MEDIANS, MEANS, AND AVERAGE DEVIATIONS ON TEN
JOB AREAS OBTAINED FROM NINE AREA PERSONNEL PLANNERS

Job                    Respondents                      Median    Mean     Average
Areas    A    B    C    D    E    F    G    H    I      Utility   Utility  Deviation

SO      100   90  100  100   90   95  100  100  100      100      97.2       3.7
ET       90  100  100   95  100  100   95   90   90       95      95.6       4.0
RM       85   80   95   90   75   80   90   90   80       90      85.0       5.6
YN       70   70   75   50   60   70   55   70   65       70      65.0       6.7
SK       73   50   50   60   55   40   60   65   60       60      57.0       6.0
HM       50   30   80   55   70   50   60   70   50       55      57.2      11.8
MM       75   20   50   60   40   50   65   50   45       50      51.7      11.8
DK       65   10   50   30   20   40   30   60   40       40      38.3      12.4
EN       20   40   50   30   25   30   40   30   55       30      35.5       7.0
PC       10   60   50   20   30   50   40   20   30       30      34.4      13.8
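
The summary columns of Table 8 can be recomputed from the raw ratings. The sketch below is a minimal illustration in Python, not part of the original study; the function name is hypothetical, and it assumes that the average deviation is the mean absolute deviation taken about the mean, which the text does not state but which reproduces the tabled figures for the ET row.

```python
from statistics import mean, median

def summarize_ratings(ratings):
    """Return (median, mean, average deviation) for one job area's ratings."""
    m = mean(ratings)
    # Assumption: "average deviation" is the mean absolute deviation about the mean.
    avg_dev = sum(abs(x - m) for x in ratings) / len(ratings)
    return median(ratings), m, avg_dev

# Ratings of the ET job area by respondents A through I (Table 8).
et_ratings = [90, 100, 100, 95, 100, 100, 95, 90, 90]
med, avg, dev = summarize_ratings(et_ratings)
print(f"median={med} mean={avg:.1f} average deviation={dev:.1f}")
```

For the ET row this gives a median of 95, a mean of 95.6 and an average deviation of 4.0, in agreement with the table.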

Conversion  of  the  Graduate-Fail  Criterion  to  a  Utility  Scale 

The utility of a graduate was set at the utility of a satisfactory
assignee to the corresponding job area, that is, the job area for which
the school trains recruits. Then, the utility of a "fail" was determined
in relation to this value. Since the majority of "fails" are
sent to the fleet for on-the-job training in the corresponding job,
and since those who conduct this training also supervise graduates,
the value of the average "fail" was determined by asking the Navy personnel
conducting this on-the-job training. The scaling method was
magnitude estimation. That is, the supervisors were asked to assume
that the average graduate is worth $10,000 to the Navy and to indicate
the relative value of a failure. This was done for each job area.

The  scaling  questionnaire  is  presented  in  Appendix  B. 
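
The dollar judgments translate into utilities by simple proportion: the utility of a fail stands to the utility of a graduate as the judged dollar value of a fail stands to $10,000. The following Python sketch illustrates the arithmetic; the function and parameter names are illustrative and do not come from the study, and the inputs are the SO figures reported in Table 9.

```python
def fail_utility(graduate_utility, mean_fail_dollars, graduate_dollars=10_000):
    """Scale a dollar judgment of a "fail" onto the graduate's utility scale."""
    return graduate_utility * mean_fail_dollars / graduate_dollars

# SO: graduate utility 100, mean judged value of a fail $4,864.
print(round(fail_utility(100, 4864)))
```

Rounded to the nearest whole number, this gives 49 for SO, the value tabled for the utility of a fail.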

In this way the utility of a "fail" relative to the utility
of a graduate was determined for each school. This scale constitutes
a new criterion against which to validate selection tests through the
utility function method. Table 9 presents the results for the schools
used in the research presented in Chapter VII. The means of the supervisors'
responses are in column two, and the number of supervisors in
column three. The average deviation is reported rather than the
standard deviation because there are extreme deviations which, when
squared, would bias estimation of the standard deviation. U_F was set
so that U_F is to U_G as Mean Utility of a Fail is to 10,000. The schools
are described in Chapter VII.

TABLE 9


THE UTILITY OF A FAIL RELATIVE
TO THE UTILITY OF A GRADUATE

            Mean
            Utility               Average
School      of a Fail      N      Deviation     U_G*     U_F

SO          $4,864        11       3,008        100       49
ET           4,759        29       1,699         95       45
RM           5,809        54       1,814         90       52
YN           7,316        60       2,558         70       51
SK           6,380        24       2,214         60       38
MM           6,941        76       3,045         50       35
EN           5,886        22       2,502         30       14

* These are the median utilities presented in Table 8.

CHAPTER VI

PAYOFF MATRICES AND A WAY TO DETERMINE THEM

Previous chapters have shown that a new selection test evaluation
approach is needed and that statistical decision theory provides
a promising model for this problem. However, a formidable prerequisite
to using this approach is determining the utilities in payoff
matrices. The present chapter is devoted to this task. A way to
reduce it to a more manageable form is explained. The payoff matrices
used in the next chapter are also presented.

A payoff matrix is a rectangular array of numbers which represent
the utilities of decision-outcome combinations. The utilities
express the gain and/or loss to the institution in terms of which the
decision was made. Thus, they express the desirability of the consequences
of decisions. The number may be positive or negative for
any particular decision-outcome combination. If it is positive the
gain outweighs the loss, while if it is negative the contrary is true.

A payoff matrix for the selection situation is a 2 X 2 matrix
of numbers which represent the relative utilities of the four
decision-outcome combinations. See, for example, Figure 7. Thus, for
each contingency table with observations in each cell, a payoff matrix
is needed with a utility in each cell. A particular utility
pertains to each observation in the corresponding cell of the contingency
table, indicating its net desirability.
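
The cell-by-cell pairing of a contingency table with a payoff matrix can be sketched as follows. This is only an illustration in Python with hypothetical frequencies and utilities; the unnormalized weighted sum shown here is an assumption made for the example and is not the U_T index, whose formula is given in Chapter IV.

```python
# Cells are keyed by (outcome, decision), matching the 2 x 2 layout.
CELLS = [
    ("satisfactory", "reject"),
    ("satisfactory", "accept"),
    ("unsatisfactory", "reject"),
    ("unsatisfactory", "accept"),
]

def total_utility(counts, payoffs):
    """Sum of cell frequencies weighted by the utilities of the same cells."""
    return sum(counts[cell] * payoffs[cell] for cell in CELLS)

# Hypothetical frequencies and utilities, for illustration only.
counts = dict(zip(CELLS, [20, 60, 15, 5]))
payoffs = dict(zip(CELLS, [-30.0, 100.0, 0.0, -200.0]))
print(total_utility(counts, payoffs))
```

Each observation contributes the utility of the cell it falls in, so a few costly erroneous acceptances can offset many correct ones.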


                            Decision
                       Reject      Accept

Outcome
   Satisfactory          U_1         U_2
   Unsatisfactory        U_3         U_4

Fig. 7.--The standard payoff matrix for a dichotomous
decision and a dichotomous outcome.

Utility of a Correct Acceptance

In the personnel selection situation the most meaningful element
of the payoff matrix seems to be U_2, the utility of a correct acceptance.
It can be taken as the utility of obtaining a satisfactory person
for the job or position. In many settings this utility might be
expressed in dollars through cost accounting or job evaluation procedures.

In other settings construction of a numerical scale by scaling
job areas on utility is more efficient. This is particularly true in
a large institution where many job areas are to be evaluated and many
predictors are used. Even though the resulting utility scale will be
"unfamiliar" as compared to the dollar scale, it will have relative
meaning across job areas and, therefore, tests. Many psychological
scaling techniques are potentially useful for this purpose. Chapter V
presents two used in this study.

Utility of an Erroneous Rejection

Rejecting a person who would have been satisfactory may or may
not be a serious error. Whether it is or not largely depends on three
factors: (1) the need for a satisfactory assignee, (2) the proportion
of the testee population that would be satisfactory, and (3)
the proportion of the testee population that is needed. U_1 should
therefore be some function of U_2, p (the a priori probability), and
the proportion needed. The following rules were adopted for the
situation to which this study pertains:

1. a satisfactory assignee is lost. (-U_2)

2. the loss does not quite balance actually obtaining
a satisfactory assignee. (-U_2 + a)

3. the loss decreases as p increases.

4. the loss increases as the proportion needed increases
(not to be confused with q, the selection ratio).

These rules provide some restriction on U_1. They constitute the leverage
the writer was able to bring to bear on this utility in the situation
under study. The first rule is true because each time this
error of decision occurs a rejected person who would have been satisfactory
is in fact lost as far as the assignment is concerned. Rule
two modifies rule one. It was adopted because the cost of training
an acceptee is not expended on a rejectee. Therefore, the institution
loses a satisfactory assignee but saves on training costs as a result
of each decision to reject a person who would have succeeded.

Rules three and four were also adopted on logical grounds.
Rule three is based on the assumption that the more abundant the
persons who would succeed, the less the loss of failing to identify
them. At the other extreme, if the test is used to identify rare
persons, missing one would be considered a costly error, generally
speaking. As with rules one and two, rules three and four are interdependent.
Rule four says that the relationship hypothesized in rule
three is dependent upon the proportion needed. For example, if very
few would succeed, the loss of rejecting such a person might not be
great if even fewer are needed.

Following these rules the following was adopted as an arbitrary
but reasonable expression of U_1:

$$U_1 = p(1 - q')U_2 - U_2 \qquad (1)$$

or

$$U_1 = -U_2[1 - p(1 - q')] \qquad (2)$$

The quantity q' is the proportion needed referred to in rule four
above. It is not necessarily equal to q, the selection ratio, since
the latter is determined by x_c which depends upon the values in the
payoff matrix. In the Navy situation q' is the proportion of inductees
needed to meet and maintain personnel requirements in the
job area. Compared to q it is quite small. The disparity is due to
quota restrictions and to the fact that many testees choose another
school or fleet duty.

The expression 1 - p(1 - q') in Equation (2) is always less than
one and represents the absolute size of U_1 relative to U_2. A few
examples will show how this expression relates to p and q':

p:               .2   .2   .2   .2   .5   .5   .5   .5   .8   .8   .8   .8
q':              .1   .2   .5   .8   .1   .2   .5   .8   .1   .2   .5   .8
1 - p(1 - q'):  .82  .84  .90  .96  .55  .60  .75  .90  .28  .36  .60  .84
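
The tabled factors follow directly from Equation (2). The short Python sketch below recomputes them and shows the resulting utility of an erroneous rejection; the function name is illustrative, and the value of 100 used for the utility of a correct acceptance is a hypothetical input.

```python
def erroneous_rejection_utility(u2, p, q_needed):
    """Equation (2): U1 = -U2 * [1 - p * (1 - q')]."""
    return -u2 * (1.0 - p * (1.0 - q_needed))

# Recompute the factor 1 - p(1 - q') for the tabled values of p and q'.
for p in (0.2, 0.5, 0.8):
    factors = [1.0 - p * (1.0 - q) for q in (0.1, 0.2, 0.5, 0.8)]
    print(p, [f"{f:.2f}" for f in factors])

# With a hypothetical U2 of 100, p = .8 and q' = .5 the loss is 60 utiles.
print(round(erroneous_rejection_utility(100, 0.8, 0.5), 2))
```

The loss shrinks as successful persons become more abundant (larger p) and grows as more of the testee population is needed (larger q'), as rules three and four require.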

Utility  of  an  Erroneous  Acceptance  and 
the  Utility  of  a  Correct  Rejection 

The two remaining quantities of the payoff matrix, U_3 and U_4,
are more nebulous in most situations than the two treated above. However,
there is a way to circumvent direct estimation.

It was pointed out in Chapter IV that the index U_T is independent
of the addition of a constant to the quantities in a row of
the payoff matrix. Appendix A presents the mathematical proof. It
was also shown in Chapter III that the optimum cutoff is the point on
the test score scale where

$$\lambda(x) = \frac{P(F)\,(U_{R,F} - U_{A,F})}{P(S)\,(U_{A,S} - U_{R,S})} \qquad (3)$$

assuming that $\lambda(x)$ is a monotonic increasing function of test score,
x, and that $U_{R,F} > U_{A,F}$ and $U_{A,S} > U_{R,S}$. Equation (3) was demonstrated
in Chapter IV (Equations 9, 10 and 11) to be equivalent to

$$P(S \mid x_c) = \frac{U_3 - U_4}{(U_2 - U_1) + (U_3 - U_4)} \qquad (4)$$

Thus, the optimum cutoff is the point on the test score scale where

$$\frac{P(S)}{1 - P(S)} = \frac{U_3 - U_4}{U_2 - U_1} \qquad (5)$$

This point may be called the "saddlepoint," a term used in game
theory to denote the conditions under which the players win equal
amounts on the average. In terms of the accept-reject decision, it
is the point on the test score scale where it makes no difference
whether the testee is accepted or rejected; the expected payoff is
the same in either case.

It is readily apparent from Equation (5) that the difference
U_3 - U_4 could be computed if an estimate of P(S) for this saddlepoint
was available, after the remaining elements, U_1 and U_2, have been estimated.
When U_3 is subtracted from the quantities in the bottom row of
Figure 7 the matrix in Figure 8 results.


                         Decision
                    Reject      Accept

Outcome
   Succeed            U_1         U_2
   Fail                0        U_4 - U_3

Fig. 8.--A modified payoff matrix obtained by adding -U_3
to the entries in the bottom row of Figure 7.

Thus, the quantity needed for the calculation of U_T is simply the negative
of the difference in Equation (5).
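
Solving Equation (5) for the difference U_3 - U_4 can be sketched in a few lines of Python. The function names and the numerical inputs below are hypothetical; the second function simply checks the saddlepoint property that accepting and rejecting have the same expected payoff at the cutoff, using the Figure 8 convention that the correct-rejection cell is zero.

```python
import math

def u3_minus_u4(u1, u2, p_success_at_cutoff):
    """Solve Equation (5) for U3 - U4 given U1, U2 and P(S) at the cutoff."""
    if p_success_at_cutoff >= 1.0:
        return math.inf  # no failures are expected at the cutoff
    odds = p_success_at_cutoff / (1.0 - p_success_at_cutoff)
    return (u2 - u1) * odds

def expected_payoffs(p, u1, u2, u3, u4):
    """Expected utility of (reject, accept) for a testee with P(S) = p."""
    return p * u1 + (1 - p) * u3, p * u2 + (1 - p) * u4

# Hypothetical values: U1 = -40, U2 = 100, P(S) = .80 at the cutoff.
diff = u3_minus_u4(-40.0, 100.0, 0.80)
# Figure 8 convention: U3 = 0, so the erroneous-acceptance cell is -(U3 - U4).
reject, accept = expected_payoffs(0.80, -40.0, 100.0, 0.0, -diff)
print(round(diff, 6), round(reject, 6), round(accept, 6))
```

Because the odds P(S)/(1 - P(S)) grow without bound as P(S) approaches one, a cutoff at which nearly everyone succeeds implies a very large penalty for an erroneous acceptance.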

If the test has been used for a long time by the institution,
in a fairly stable situation, and the cutoff has been set on a trial-and-error
basis with plenty of feedback from performance criteria, then
the established cutoff can be accepted as a fairly accurate estimate
of the optimum one. This is the case in the situation to which this
study pertains. Therefore, the present cutoff was used, the probability
of a graduate at the cutoff determined, and that probability
value used to compute U_3 - U_4. If the above mentioned conditions were
not the case, it would be necessary to obtain an estimate of the saddlepoint.
One appealing way would be to ask persons in responsible positions,
i.e., ones capable of making higher-level decisions, questions
like the following: "Would you assign person P with characteristics
m, n, and o, to position A, if he has a 60% chance of success; 70%;
80%; 90%?" This would seem to be a very meaningful task for a person
familiar with the current success-ratio and needs of the institution.


Determining the Payoff Matrices for this Study

Table 10 presents the utilities in the payoff matrices for the
schools under study. (One school, PC, was dropped because of insufficient
data.) Statistics used in determining the utilities are also
given: N, the number of students in the sample upon which the other
quantities are based; r, the validity coefficient of the selection
test, obtained prior to dichotomization and corrected for restriction
of the range; p; P(G|x_c), the probability of a graduate at the cutoff;
and q'.

Correction of r for restriction of the range was necessary
because school assignment had been made on the basis of one or more
of the tests to be evaluated, resulting in direct restriction of test
score range. A method of correcting for restriction developed by
Lawley (1943) and expanded by Meredith (1953) was applied to the
data. Data from an unrestricted sample of 500 recruits were used to
obtain the base values of the test intercorrelations, means and standard
deviations needed for correcting the school matrices. The tests
and schools are described in the following chapter.

Each corrected r was calculated by computer using the following
formula:

$$r = \frac{S_{yx} S_{xx}^{-1} d_x r_{xx}}{\sqrt{S_y^2 + S_{yx} S_{xx}^{-1} d_x \left(r_{xx} - d_x^{-1} S_{xx} d_x^{-1}\right) d_x S_{xx}^{-1} S_{yx}}} \qquad (6)$$

where

d_x = a vector of the test standard deviations based on an
unrestricted sample,

r_xx = an intercorrelation matrix of the tests based on an
unrestricted sample,

S_xx = the variance-covariance matrix of the tests for the
restricted sample,

S_yx = the vector of validity coefficients for the restricted
sample,

S_y^2 = the criterion variance for the restricted sample.
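
For a single selection variable the matrix expression in Equation (6) collapses to scalars, which makes the correction easy to trace. The Python sketch below covers only that special case and is not the program used in the study. It assumes that S_yx enters as the restricted-sample covariance between test and criterion, and it sets the restricted criterion standard deviation to 1, which is harmless because the corrected correlation does not depend on the criterion scale. The function name and the numerical inputs are hypothetical.

```python
import math

def corrected_validity(r_restricted, sd_x_restricted, sd_x_unrestricted):
    """Single-predictor case of Equation (6), criterion SD taken as 1."""
    s_x, d_x = sd_x_restricted, sd_x_unrestricted
    s_y = 1.0                            # the correlation is scale free in y
    s_yx = r_restricted * s_y * s_x      # restricted-sample covariance
    numerator = s_yx / s_x**2 * d_x      # S_yx S_xx^-1 d_x r_xx with r_xx = 1
    # d_x (r_xx - d_x^-1 S_xx d_x^-1) d_x reduces to d_x^2 - s_x^2.
    pop_var_y = s_y**2 + (s_yx / s_x**2) ** 2 * (d_x**2 - s_x**2)
    return numerator / math.sqrt(pop_var_y)

# Hypothetical case: r = .30 in a sample whose test SD was halved by selection.
print(round(corrected_validity(0.30, 5.0, 10.0), 3))
```

When the restricted and unrestricted standard deviations are equal, the function returns the restricted r unchanged, as it should.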

The denominator of Equation (6) is an estimate of the population
standard deviation of the criterion. An estimate of the population
mean is

$$M_y - S_{yx} S_{xx}^{-1}(M_x - \mu_x)$$

where S_yx and S_xx are as defined above, and where

\mu_x = a vector of test means based on an unrestricted sample,
M_x = the vector of test means for the restricted sample.

These estimates of the population means and standard deviations
of the criteria were used in determining the p's in Table 10. The
graduate-fail division point is 63 on the school grade scale. The
z-score value of this point was calculated for each school by dividing
the difference between 63 and the mean by the standard deviation. A
table of the normal distribution revealed the proportion, p, above the
z-score value.
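
The table lookup just described amounts to an upper-tail normal probability. A minimal Python sketch follows, using the standard library's NormalDist in place of a printed table; the function name and the population mean and standard deviation shown are hypothetical, since the school values themselves appear only in Table 10.

```python
from statistics import NormalDist

def proportion_graduating(pop_mean, pop_sd, pass_mark=63.0):
    """Proportion of the population expected to score above the pass mark."""
    z = (pass_mark - pop_mean) / pop_sd
    return 1.0 - NormalDist().cdf(z)

# Hypothetical population estimates for one school's final grades.
print(round(proportion_graduating(73.0, 10.0), 2))
```

With a population mean one standard deviation above the division point, about 84 per cent would be expected to graduate under random selection.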

P(G|x_c) was calculated using "success-ratio" theory as presented
by Walker (1937). The success ratio is the probability that persons
with a given test score will succeed. Thus, it is the number successful
with a given test score, divided by the total number with that test
score. Walker developed a "success function" based on the assumptions
of normality and linearity that can be used to determine the theoretical
success ratio at any particular test score. The success function is
the probability function with mean at k/r, k being the z-score value of
p on the criterion, and standard deviation equal to $\sqrt{1 - r^2}/r$. These
are converted to units of the test score scale so that any test score
of interest can be located on the success function. (The new mean
corresponds to the test mean plus the product of k/r and the test
standard deviation. The new standard deviation is equal to the product
of $\sqrt{1 - r^2}/r$ and the standard deviation of the test.)

TABLE 10

PAYOFF MATRIX UTILITIES AND ANTECEDENT
STATISTICS FOR NINE SCHOOL SAMPLES

To  determine  the  success  ratio  at  a  particular  test  score,  the 
score  is  located  on  the  success  function,  its  z-score  is  computed,  and 
the  success  ratio  is  read  from  a  table  of  the  normal  distribution. 
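
The procedure above can be sketched in Python as follows. This is an illustration under the stated assumptions of normality and linearity, not Walker's own tabulation: the function name and the numerical inputs are hypothetical, k is taken as the criterion z-score with proportion p above it, and the normal table is replaced by the standard library's NormalDist.

```python
import math
from statistics import NormalDist

def success_ratio(score, test_mean, test_sd, r, p):
    """Theoretical P(success | test score) under normality and linearity."""
    std_normal = NormalDist()
    k = std_normal.inv_cdf(1.0 - p)           # criterion z-score with p above it
    centre = test_mean + (k / r) * test_sd    # success-function mean, test units
    spread = (math.sqrt(1.0 - r * r) / r) * test_sd
    return std_normal.cdf((score - centre) / spread)

# Hypothetical test with mean 100 and SD 20, r = .60, base rate p = .85.
for cutoff in (90, 100, 110):
    print(cutoff, round(success_ratio(cutoff, 100.0, 20.0, 0.60, 0.85), 3))
```

The success ratio rises with the test score, and it rises more steeply the larger r is, since the spread of the success function shrinks as r approaches one.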

The success function for each school was determined using the
corrected r and the population means and standard deviations for the
test and criterion. Then the operational cutoff on the test was located
on the success function. P(G|x_c) was obtained from a table of the normal
distribution using the z-score value of the cutoff on the success
function. See Appendix E.

The proportion of recruits needed, q', was obtained from an unpublished
report by the U. S. Navy Personnel Research Activity at San
Diego (1964). It is based on the number of men who must enter the job
area at the lowest level each year in order for the job area to contain
the required number of Petty Officers in the future. This number was
then divided by the number of inductees during fiscal 1964.

The U_2's in Table 10 are those obtained through the magnitude
estimation scaling method. They are the median utilities presented
in Table 8. The U_1's were calculated using Equation (1) while Equation
(5) was used for calculating the U_3 - U_4 quantities. In the case
of HM, U_3 - U_4 is infinity because p is so high and the cutoff is so
low that, theoretically, no fails are to be expected.

As can be seen in Table 10, the U_3 - U_4 quantities are very
large relative to U_1 and U_2. This is due to the fact that P(G|x_c)
is quite large for most of the schools. It is not due to the size of
U_1, the other quantity upon which U_3 - U_4 is based. For example, if
U_1 was zero in the case of SO, U_3 - U_4 would still be very large,
namely 2400. The assumption that the present cutoffs are optimum
leads through Equation (5) to the very large U_3 - U_4 quantities. To
put it another way, use of the present cutoffs implies that the avoidance
of failing students is of primary importance to the Navy.

CHAPTER VII

AN  EMPIRICAL  TRYOUT 

The results of an empirical tryout of the new evaluation index
U_T are presented in this chapter. The indices r, E, and G_U are also
presented for comparison. The tests evaluated are the ones used in
selecting Navy recruits for technical training. The criterion is
final course grade in the correlational analysis, while in the case of
U_T it is dichotomized final grade, graduate-fail. Final grade is
assigned by school instructors through a differential weighting of
the individual achievement and proficiency tests taken during the
course of training.


The  Schools  Sampled 

1. Sonarman (SO).--This is a 16-week course. The curriculum
consists of (1) operation of sonar equipment, (2) International Morse
Code communications, (3) basic electricity, electronics and sonar
equipment circuitry, (4) cleaning and lubrication of sonar equipment,
and (5) use of equipment for testing electronic performance of sonar
equipment.

The final grade is based on four practical and 17 written
examinations. The latter receive 80 per cent of the weight. (This
is the formal weighting system. The effective weights which depend on
variances and intercorrelations are not known.)

The a priori probability (the proportion who would graduate
if selection was random) has been estimated in previous research to
be .85. The cutoff used for selecting recruits for this school is
GCT + ARI = 110 which is .52 standard deviations above the mean. (See
below for a description of these tests.)

2. Electronics Technician (ET).--This school is 36 weeks long.
The curriculum covers basic electricity and electronics, required
mathematics, and maintenance and repair of communication equipment.

The final grade is based on eight practical and 18 written
examinations. The latter receive 88 per cent of the weight.

The a priori probability, p, for this school is .56. The
present cutoff is GCT + ARI + ETST = 170 which is .72 standard deviations
above the mean.

3. Radioman (RM).--This is a 24-week school. Its curriculum
consists of instruction in the operation of radios, teletypewriters
and voice radio equipment, transmission and reception of messages by
International Morse Code, basic electricity and electronics, operation
and maintenance of receiving and transmitting equipment.

Final grade is based on 39 examinations. Approximately one-half
of these are written and one-half are practical. Written and
practical examinations are weighted equally in arriving at final grade.

For this school p is .88. The cutoff is GCT + ARI = 100 which
is .15 standard deviations below the mean.

4. Yeomen (YN).--This school is eight weeks long. The curriculum
covers clerical duties, including typing, filing, operation of
duplicating machine equipment and general office work, and records for
courts-martial.

Written examinations (7) cover 80 per cent of the final grade
and practical examinations (6) make up the remaining 20 per cent.

For this school p is .97. The cutoff is GCT + CLER = 110
which is .97 standard deviations above the mean.

5. Storekeepers (SK).--This is a 12-week course. The instruction
covers general stores supply afloat, clothing and small stores,
ships store, provision, repair parts, records and reports, typing, and
practical work in all phases of supply afloat.

The final grade is based on one practical and 21 written
examinations. The latter receive 80 per cent of the weight.

The a priori probability, p, is .66, and the cutoff is
GCT + ARI = 105 which is .18 standard deviations above the mean.

6. Machinist's Mates (MM).--This school consists of 12 weeks
of instruction in principles of main propulsion machinery and auxiliaries
operation, maintenance and repair; handtools, gauges and
instruments as used in operating, checking, adjusting and performing
preventive maintenance. Auxiliary machinery covered includes refrigeration
equipment, evaporators, pumps, compressors, heat exchangers,
and emergency electrical generators.

The final grade is based on 12 practical and 61 written
examinations. The latter receive a formal weight of 93 per cent.

For this school p is .90. The cutoff is GCT + MECH = 105
which is .29 standard deviations above the mean.

7. Engineman (EN).--This is a 12-week school for the training
of men to operate, maintain, and repair internal-combustion engines.
The school provides for study and work experience in the following
areas: (1) Mathematics, blueprint reading, temperature and pressure
instruments, and basic electricity; (2) Threadcutting, pipefitting,
soldering and use of hand tools; (3) Theory, construction, and operation
of diesel and gasoline engines and their associated equipment;
(4) Auxiliaries including boilers, distilling plants, air compressors,
pumps, refrigeration, and air conditioning; (5) Damage control.

The final grade is based on application marks and 61 written
examinations, the latter receiving 95 per cent of the weight.

For this school p is .80. The cutoff is ARI + MECH = 105
which is .37 standard deviations above the mean.

The Selection Tests

1. The General Classification Test (GCT).--This is a 100-item
test of verbal aptitude consisting of sentence completion and verbal
analogy items. The alternate form reliability is .93. A single Navy
Standard Score, having a mean of 50 and a standard deviation of 10,
was used.

2. The Arithmetic Test (ARI).--This test consists of two
separately-timed subtests. A 20-item Arithmetic Computation subtest
provides a measure of speed and accuracy in performing elementary computations,
and a 30-item Arithmetic Reasoning subtest provides a measure
of ability to solve verbally presented quantitative problems. Only
total score in Navy Standard Score was used. The alternate form reliability
of this test is .85.

3. The Mechanical Test (MECH).--This test consists of two
separately-timed 50-item subtests: Mechanical Comprehension and Tool
Knowledge. Only total score was used, and it was expressed in Navy
Standard Score. The alternate form reliability is .86.

4. The Clerical Test (CLER).--This is a 210-item highly speeded
test of number matching. The subject compares two adjacent columns
of 5- to 9-digit numbers and indicates whether or not they are identical.
A total score in Navy Standard Score form was obtained using the formula
Number Right minus Number Wrong. The alternate form reliability
of this test is .77.

5. The Electronics Technician Selection Test (ETST).--This
test is primarily a measure of achievement and experience in areas
related to electronics maintenance. It has five separately-scored
subtests: Mathematics (20 items, some requiring a knowledge of algebra
for their solution); Science (20 items, primarily physics); Shop Practice
(10 items); Electricity (15 items) and Radio (15 items, some requiring
a knowledge of electronic circuitry). Total test score in
Navy Standard Score was used. The alternate form reliability of this
test is .69.

The Results

The results are presented in Tables 11 and 12. In Table 11
the three evaluation indices are presented along with pertinent information
on the samples and schools. The a priori probabilities are
those in Table 10 in Chapter VI. The selection ratio, q, was obtained
in each case from a table of the normal distribution using the z-value
of the cutoff given above. N is the total sample size upon which the
calculations involved in the correlational approach are based. Only
n_2 and n_4 were used in calculating G_U and U_T. G_U was obtained using
Equations (13) and (15) in Chapter IV, the utilities in Table 9, and
the p and n's in Table 11. U_T was obtained using Equations (4) and
(7) in Chapter IV, the utilities in Table 10, and the p and n's in
Table 11. The discrepancies between N and n_2 + n_4 are due to waivers,
i.e., acceptees whose test score composite did not exceed the cutoff.

G_U and U_T are expressed in units of the same utility scale,
the one obtained through the scaling techniques described in Chapter
V. However, G_U expresses the utiles gained for each man selected by
the tests while U_T expresses the utility of the tests for all the
accept-reject decisions which led to this sample of students above the
cutting score.


G_UMax and U_TMax are maximum values, i.e., ones which would be
obtained with perfect selection (if all the selectees had graduated).
They are the utility function and decision-theoretic approaches' counterparts
of an r of 1.00. When $p \ge q$ the formulas are

$$G_{U_{Max}} = U_G - [pU_G + (1 - p)U_F]$$

TABLE 11

THE THREE EVALUATION INDICES AND CLOSELY RELATED
STATISTICS FOR SAMPLES FROM SEVEN NAVY SCHOOLS

                                                                          P_G     P(G)
                                                                         above   above
School   N     p     q    n_2   n_4    r     G_U     U_T   G_UMax  U_TMax  x_c     x_c

SO      387   .83   .30   289    30   .61    2.85   32212    7.65  139960  .906    .98
ET      446   .56   .24   164    19   .63   16.81   22639   22.00   29631  .896    .86
RM      519   .88   .56   334    40   .55    0.50    4587    4.56   42187  .893    .97
YN      417   .97   .17   236     4   .45    0.23   24320    0.57   54720  .983    .99
SK      272   .66   .43   169    18   .57    0.96    8180    3.08   26180  .904    .97
MM      554   .90   .38   318     4   .68    1.31   54511    1.50   62243  .988    .99
EN      214   .80   .35   159     8   .63    2.43    9398    3.20   12358  .952    .97

Notes:

N:               sample size.
p:               a priori probability (or base rate).
q:               selection ratio.
n_2:             number graduating above the cutoff x_c.
n_4:             number failing above the cutoff x_c.
r:               product-moment correlation coefficient obtained prior
                 to dichotomization and corrected for restriction of
                 range using the Lawley method described in the
                 preceding chapter.
G_U:             test evaluation index of the utility function approach.
U_T:             test evaluation index of the decision-theoretic approach.
G_UMax:          the value of G_U that would have been obtained had the
                 test provided perfect selection.
U_TMax:          the value of U_T that would have been obtained had the
                 test provided perfect selection.
P_G above x_c:   the proportion graduating above the cutoff x_c.
P(G) above x_c:  the probability of a graduate above x_c obtained from the
                 Taylor-Russell tables, or P(G | x > x_c).
TABLE 12


THE UTILITY OF SELECTION TESTS AS ESTIMATED BY THE
THREE APPROACHES: CORRELATIONAL, UTILITY
FUNCTION, AND DECISION-THEORETIC

          Proportion Improvement           The Index Values Expressed
          Over Chance Prediction           In Number of Graduates

School    E    U_T/U_TMax  G_U/G_UMax   [P(G)* - p](n_2 + n_4)  G_U(n_2 + n_4)/U_G  U_T/U_G

SO       .21      .37         .37               41.5                   9.1            322.1
ET       .23      .76         .76               54.9                  32.4            236.3
RM       .16      .11         .11               33.7                   2.1             51.0
YN       .11      .44         .44                4.8                   0.9            347.4
SK       .18      .31         .31               20.6                   3.0            136.3
MM       .27      .88         .88               26.3                   8.5           1090.2
EN       .23      .76         .76               28.4                  13.6            313.3

*P(G | x > x_c)


and

    U_{TMax} = (n_2 + n_4)(U_2 - U_1) - U_c.

These were derived from the formulas for G_U and U_T by making the
following substitutions in Equations (13) and (4) in Chapter IV:
n_2 = n_2 + n_4 and n_4 = 0. When q > p the substitutions would be
n_2 = pN and n_4 = N(q - p), yielding different formulas for G_UMax and
U_TMax.
The last two columns of Table 11 present the obtained and
theoretical proportions graduating above the cutoff. P_G above x_c is
n_2/(n_2 + n_4) while P(G) above x_c is the probability that a randomly
selected person above the cutting score will graduate. The latter
was obtained from the Taylor-Russell tables using r, p, and q for each
school.

Taking SO as an example, the validity coefficient is .61, G_U
is 2.85, the gain in utiles for each man selected by the test, and
U_T is 32,212, the utility of the test for the accept-reject decisions
which led to this sample of students.

In Table 12 the results are presented in two forms which make
the three evaluation indices comparable. They pertain to utility or
practical significance. In terms of the proportion improvement over
chance prediction the utility function approach and the decision-
theoretic approach agree precisely. U_T/U_TMax and G_U/G_UMax are larger
than E in all but one of the seven schools.
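
The agreement between the two ratios can be checked directly from the Table 11 entries. The following is a minimal Python sketch, not part of the original analysis; the function name and the choice of three schools are illustrative assumptions, and the figures are simply transcribed from Table 11.

```python
# Index values transcribed from Table 11 for three of the seven schools.
# Each entry: (G_U, G_UMax, U_T, U_TMax)
table_11 = {
    "ET": (16.81, 22.00, 22639, 29631),
    "SK": (0.96, 3.08, 8180, 26180),
    "EN": (2.43, 3.20, 9398, 12358),
}

def proportion_of_maximum(obtained, maximum):
    """Fraction of the perfect-selection value that the test achieved."""
    return obtained / maximum

for school, (g_u, g_max, u_t, u_max) in table_11.items():
    by_utility_function = proportion_of_maximum(g_u, g_max)
    by_decision_theory = proportion_of_maximum(u_t, u_max)
    print(f"{school}: G_U/G_UMax = {by_utility_function:.2f}, "
          f"U_T/U_TMax = {by_decision_theory:.2f}")
```

For each of these schools the two ratios round to the same two-decimal value, which is the sense in which the two approaches "agree precisely."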

In the last three columns of Table 12 the three evaluation
indices are expressed in terms of "number of graduates." Only in the
first of these columns, which pertains to the correlational approach,
can the quantities be taken as the number of graduates actually gained
through the use of the test. This is the Taylor-Russell interpretation
of validity coefficients. In the other two columns the utility index
values have been translated into "number of graduates" by dividing them
by U_G, the utility of a graduate of that school. For example, the first
term of the last column was obtained by dividing U_T for SO--32,212--
by 100, the utility of a graduate of SO school.

Thus each quantity in these three columns means that the
selection test was as valuable as would be actually adding that many
graduates to the operational Navy. Any differences between the three
quantities in a given row must be due to differences in the evaluation
approaches which lie behind the three test evaluation indices. As can
be seen there are large differences. The approaches lead to radically
different conclusions regarding the utility of the tests. The decision-
theoretic approach demonstrates that the tests are worth much more than
either the correlational approach or the Taylor-Russell approach would
indicate. In the case of SO, the Taylor-Russell approach indicates
the use of the selection tests meant a gain of 41.5 SO graduates for
this group of 319--n_2 + n_4--selectees. The utility function approach
indicates the gain was equivalent to gaining 9.1 new SO graduates. The
decision-theoretic approach indicates the gain was equivalent to gaining
322.1 new SO graduates.
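
The two utility-based conversions for SO can be retraced with a few lines of arithmetic. This is a hedged Python sketch rather than the author's procedure: the function names are hypothetical, the index values come from Table 11, and the utility of an SO graduate is read as 100 from the preceding paragraph.

```python
def graduates_from_g_u(g_u, n_selected, u_graduate):
    """Utility function index: utiles per selectee, times selectees, in graduate units."""
    return g_u * n_selected / u_graduate

def graduates_from_u_t(u_t, u_graduate):
    """Decision-theoretic index: total utiles, in graduate units."""
    return u_t / u_graduate

# SO school: G_U = 2.85, U_T = 32212, n_2 + n_4 = 289 + 30, U_G read as 100.
n_selected = 289 + 30
so_utility_function = graduates_from_g_u(2.85, n_selected, 100)
so_decision_theoretic = graduates_from_u_t(32212, 100)

print(f"utility function:   {so_utility_function:.1f} graduates")
print(f"decision-theoretic: {so_decision_theoretic:.1f} graduates")
```

The Taylor-Russell figure is left out of the sketch because it depends on a table lookup rather than on a closed formula.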

Some of the quantities in the last column of Table 12 are
quite large, indicating the tests were worth much more than would
be expected. This is primarily due to the fact that the loss for an
erroneous acceptance is very large for these schools. (See Table 10 in
Chapter VI.) Any test that reduces the number of erroneous acceptees in
such a situation will have high utility. In the case of MM the numerical
reduction was 26. As was pointed out in the concluding paragraph of
Chapter VI, the enormity of the quantities is due to the fact that
P(G|x_c) is very close to 1.00 for all but one of the schools: ET. Since
these quantities follow mathematically from the model presented in
Chapter III and the assumption that the present cutoffs are optimum, if
the model is appropriate for test-based selection decisions these
results make the cutoffs suspect.

Since the utilities and marginal probabilities are peculiar
to the empirical situation, care should be exercised in generalizing
these results.
CHAPTER VIII

DISCUSSION 

In this chapter six questions which are pertinent to this study
are raised and discussed. They are:

1. Should the complete payoff matrix for selection contain
zeros in the reject column?

2. What changes does a valid test make in the "chance" 2 X 2
table and how does U_T relate to them?

3. How do the payoff matrices needed for meaningful test
evaluation relate to those needed in decision making?

4. How does practical significance differ from statistical
significance?

5. How valuable is U_T for comparing tests?

6.  Should  the  U  scale  be  converted  to  a  dollar  scale? 

These  questions  are  discussed  in  order. 

1. Should the complete payoff matrix
for selection contain zeros in the
reject column?

Cronbach and Gleser (1957) make no distinction between rejectees
as far as utility is concerned, apparently assuming that since a
rejectee is not in the institution the decision to reject him can have
no positive or negative effect on the institution. However, the
evaluation of a test should involve consideration of the consequences
of errors and correct decisions. If persons who know the needs of the
institution decide that rejecting a person who would have done well in
a particular assignment is a loss and a serious error, then the test
should be evaluated in terms of erroneous rejections; and values
should appear in that cell of the payoff matrix. In other words, the
answer to this question must depend on the empirical situation. There
is no substitute for intra-institutional analysis of the gains and
losses resulting from decisions.

There are, however, ways to make these analyses easier and to
give them stability. Chapter III presents a system in terms of the
optimal cutoff, the one which maximizes payoff. Prior to developing
this rationale and these mathematical relationships, the author of
this dissertation tried many logical analyses and found himself
developing many diverse payoff matrices which would have led to quite
different conclusions as to the utility of selection tests. Thus,
even within an institution the determination of the most appropriate
payoff matrix is very difficult without some rationale and system to
guide the analysis.

The system presented in Chapter III is based on decision
theory. This theory assumes that decisions should be made in such
a way as to maximize payoff. When faced with a choice, the assumption
is that the best course of action is that which will, on the
average, lead to the greatest payoff. The decision to "accept" is
made only when the expected payoff is greater than the expected
payoff for "reject." The "saddlepoint" is the optimal cutoff, the point
on the test score scale where the expected payoffs are equal. This
theory clearly assumes nonzero quantities in the reject column. The
"saddlepoint" and maximum payoff concepts are meaningless otherwise.
While the answer to the question heading this section should be
empirically determined in every case, the logic of decision theory
strongly supports the use of complete payoff matrices.
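
The accept rule described above can be illustrated with a minimal Python sketch. It is not the system of Chapter III: the payoff values are the hypothetical ones discussed with Table 14, the function names are assumptions, and the break-even point is expressed as a probability of graduating rather than as a test score (the cutoff score would be the score at which P(G|x) reaches that probability).

```python
# Hypothetical payoff values (illustrative only), keyed by (outcome, decision).
PAYOFF = {
    ("graduate", "reject"): -10,  # erroneous rejection
    ("graduate", "accept"): 8,    # correct acceptance
    ("fail", "reject"): 6,        # correct rejection
    ("fail", "accept"): -5,       # erroneous acceptance
}

def expected_payoff(decision, p_graduate, payoff=PAYOFF):
    """Expected payoff of a decision for a person with the given chance of graduating."""
    return (p_graduate * payoff[("graduate", decision)]
            + (1 - p_graduate) * payoff[("fail", decision)])

def break_even_probability(payoff=PAYOFF):
    """Chance of graduating at which 'accept' and 'reject' have equal expected payoff."""
    gain_top = payoff[("graduate", "accept")] - payoff[("graduate", "reject")]
    gain_bottom = payoff[("fail", "reject")] - payoff[("fail", "accept")]
    return gain_bottom / (gain_top + gain_bottom)

p_star = break_even_probability()
print(f"break-even P(graduate) = {p_star:.3f}")
for p in (0.30, 0.50):
    accept = expected_payoff("accept", p)
    reject = expected_payoff("reject", p)
    print(f"P(graduate) = {p:.2f}: {'accept' if accept > reject else 'reject'}")
```

With these values the break-even probability is 11/29; a person with a .30 chance of graduating is rejected and one with a .50 chance is accepted.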


2. What changes does a valid test make
in the "chance" 2 X 2 table and how does
U_T relate to them?


If selection was random a "chance" 2 X 2 table would result.
In terms of a table of this nature the function of a valid test is
figuratively to shift persons from one cell to another. In Table
13 the arrows represent these shifts.


TABLE 13

A "CHANCE" 2 X 2 TABLE SHOWING THE FIGURATIVE
SHIFTS OF PERSONS A VALID TEST WOULD MAKE

                        Decision
                   Reject       Accept
Outcome
  Graduate           28   -->     42
  Fail               12   <--     18


The payoff matrix indicates just what these shifts are worth
to the institution. Consider the payoff matrix presented in Table 14.
A shift of a person in the top row of the corresponding 2 X 2 table is
worth 18 units to the institution, since instead of suffering a 10-
unit loss it gains 8 units. Similarly, a shift in the bottom row is
worth 11 units.

If the obtained 2 X 2 table presented in Table 15 is compared
with the "chance" table in Table 13, the figurative shifts are 10 in
the top row and 10 in the bottom row. These are worth 180 + 110 = 290
units to the institution. This is also the value of U_T. If the complete
"chance" and obtained 2 X 2 tables are available, this procedure can be
used instead of the formula for U_T given in Chapter IV. The result
would be the same in every case.
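
The equivalence of the two routes to U_T can be verified on the hypothetical tables. The sketch below is only an illustration: the cell values are those of Tables 13, 14, and 15 as they follow from the text (a 10-unit loss, an 8-unit gain, an 11-unit shift in the bottom row), and the variable and function names are assumptions.

```python
# Cell order throughout: (graduate, reject), (graduate, accept),
#                        (fail, reject),     (fail, accept)
payoff = (-10, 8, 6, -5)     # hypothetical payoff matrix (Table 14)
chance = (28, 42, 12, 18)    # "chance" table (Table 13)
obtained = (18, 52, 22, 8)   # obtained table (Table 15)

def total_utility(frequencies, payoffs):
    """Sum of cell frequencies weighted by the corresponding payoff values."""
    return sum(n * u for n, u in zip(frequencies, payoffs))

# Route 1: utility of the obtained table minus utility of the chance table.
u_t_direct = total_utility(obtained, payoff) - total_utility(chance, payoff)

# Route 2: count the figurative shifts and price each one.
shift_top = obtained[1] - chance[1]        # graduates moved from reject to accept
shift_bottom = obtained[2] - chance[2]     # failures moved from accept to reject
value_top = payoff[1] - payoff[0]          # 8 - (-10) = 18 units per shift
value_bottom = payoff[2] - payoff[3]       # 6 - (-5) = 11 units per shift
u_t_shifts = shift_top * value_top + shift_bottom * value_bottom

print(u_t_direct, u_t_shifts)  # prints: 290 290
```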

TABLE 14

A HYPOTHETICAL PAYOFF MATRIX

                        Decision
                   Reject       Accept
Outcome
  Graduate          -10            8
  Fail                6           -5


TABLE 15

A HYPOTHETICAL CONTINGENCY TABLE

                        Decision
                   Reject       Accept
Outcome
  Graduate           18           52
  Fail               22            8
3. How do the payoff matrices needed for
meaningful test evaluation relate to those
needed for decision making?

The two previous sections make obvious the importance of the
absolute size of the values in the payoff matrices for test evaluation.
And, as is pointed out in the next section, these values must have
meaning within a particular institutional setting. To put it
succinctly, determining the utility of a test for a particular decision
requires a quantitative estimate of the gain to the institution using
it. This estimate is directly affected by the absolute size of the
values in the payoff matrix.

If, on the other hand, the payoff matrix is to be used only for
decision-making--and in testing this means determining the optimum
cutoff--the relative size of the values in the payoff matrix is all
that is needed. Proportionally equal reductions or increases will not
affect the outcome.
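
The contrast between absolute and relative size can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the study's procedure: the payoff values and frequency tables are the hypothetical ones from Tables 13 to 15, the break-even point is expressed as a probability of graduating, and all names are hypothetical.

```python
# Cell order: (graduate, reject), (graduate, accept), (fail, reject), (fail, accept)
payoff = (-10, 8, 6, -5)            # hypothetical payoff matrix
halved = tuple(0.5 * u for u in payoff)  # proportionally equal reduction
chance = (28, 42, 12, 18)
obtained = (18, 52, 22, 8)

def utility_gain(payoffs):
    """U_T: utility of the obtained table minus utility of the chance table."""
    return (sum(n * u for n, u in zip(obtained, payoffs))
            - sum(n * u for n, u in zip(chance, payoffs)))

def break_even(payoffs):
    """Chance of graduating at which accept and reject pay equally."""
    u1, u2, u3, u4 = payoffs
    return (u3 - u4) / ((u2 - u1) + (u3 - u4))

print(break_even(payoff), break_even(halved))      # same decision rule
print(utility_gain(payoff), utility_gain(halved))  # different utility estimates
```

Halving every payoff leaves the break-even point, and hence the optimum cutoff, where it was, but halves the estimate of the test's utility from 290 to 145 units.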

The values in the payoff matrices in this study were given
quantitative and institutional meaning by scaling the job areas on
need--by determining the relative need for more men in the job areas.
These values appear in the "correct acceptance" cell of the payoff
matrices. The other values in the payoff matrices were determined
in relation to these values. The quantitative estimate of utility,
namely U_T, is expressed in the units of this need scale.

4. How does practical significance differ
from statistical significance?

Practical significance refers to value or utility while
statistical significance refers to the probable stability of an obtained
statistic. In selection the statistical significance of a correlation
coefficient indicates that there is probably a reliable association
between the selection test and the criterion. The fact of statistical
significance has no implication of how much association there is
between these factors. It simply means there is probably some
association.

Practical significance, on the other hand, corresponds more
closely to the common man's concept of significance, namely, important
and valuable. Instead of referring to an abstract level of confidence,
e.g., 5%, it refers to concrete utility. This means that practical
significance refers to the utility of the selection test in a
particular situation as well as for a particular decision. All the
situational (institutional) factors which are affected by the
consequences of the decision should be reflected. This contrasts with
statistical significance of validity coefficients which is divorced from
the situation except as it relates to the criterion. The end result is
an estimate of the test's utility to the institution employing it as
contrasted with an estimate of the criterion variance accounted for, or
of predictive efficiency.

5. How valuable is U_T for comparing
tests?

As was pointed out in Chapter II, the primary statistic being
used to evaluate selection tests is the correlation coefficient. Of
two tests with equal selection ratios, the test which correlates highest
with a dichotomous criterion will be best by any standard (disregarding
cost of testing). This is because there is but one degree of freedom
in a 2 X 2 table with fixed marginals. A decrease in the number of a
particular type of error must be accompanied by an equal decrease in
the other type of error as well as an equal increase in the other two
cells. Thus, with any payoff matrix a decrease in the number of errors,
regardless of which one, will increase the correlation coefficient as
well as U_T. With fixed marginals, any change in the 2 X 2 table will
affect both statistics in the same way. In any particular selection
situation, such as selection for one of the Navy schools, they would
lead to the same choice of test. Although this was not investigated
in this study, it is likely that U_T would be superior for two reasons:
(1) it would provide a quantitative estimate of how much better one
test is than another in a particular situation and (2) this
quantitative estimate would be in terms of a utility scale having broad
meaning for a particular institution and to which the above aspects
can be related. Correlation coefficients on the other hand cannot
claim to indicate how much better one test is than another for a
particular decision and institution because the needs and costs, gains
and losses, peculiar to that institution are not reflected in them.
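
The one-degree-of-freedom argument can be illustrated numerically. In the Python sketch below, which is an illustration and not part of the study, the marginals and payoff values are those of the hypothetical Tables 13 to 15, the phi coefficient is used as the correlation for a 2 X 2 table (an assumption; the study's coefficients were computed before dichotomization), and all names are hypothetical.

```python
import math

GRADUATES, FAILURES = 70, 30   # fixed row marginals
ACCEPTED, REJECTED = 60, 40    # fixed column marginals
PAYOFF = (-10, 8, 6, -5)       # (grad, reject), (grad, accept), (fail, reject), (fail, accept)
CHANCE_N2 = 42                 # graduates accepted under random selection (.70 x .60 x 100)

def table_from_n2(n2):
    """With fixed marginals, the graduate-accept cell determines the other three."""
    n1 = GRADUATES - n2
    n4 = ACCEPTED - n2
    n3 = FAILURES - n4
    return (n1, n2, n3, n4)

def phi(table):
    """Phi coefficient between outcome (graduate/fail) and decision (accept/reject)."""
    n1, n2, n3, n4 = table
    return (n2 * n3 - n1 * n4) / math.sqrt(GRADUATES * FAILURES * ACCEPTED * REJECTED)

def utility(table):
    return sum(n * u for n, u in zip(table, PAYOFF))

def utility_gain(table):
    """U_T: utility of the table minus utility of the chance table."""
    return utility(table) - utility(table_from_n2(CHANCE_N2))

# Every admissible table for these marginals, from worst to best selection.
rows = [(n2, phi(table_from_n2(n2)), utility_gain(table_from_n2(n2)))
        for n2 in range(30, 61)]
for n2, r, u_t in rows[::6]:
    print(f"n_2 = {n2}: phi = {r:+.3f}, U_T = {u_t:+d}")
```

Each additional correct decision raises both phi and U_T, so the two statistics rank tests identically here; only U_T states the difference in units of the payoff scale.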

6. Should the U scale be converted
to a dollar scale?


Two general approaches to utility analysis seem obvious. The
one taken in this study is to establish a purely general utility scale.
The other is to use the familiar dollar scale. The former approach was
used because the author imagined that it would be very difficult for
the respondents of the questionnaires to place the Navy's needs on the
dollar scale. Intra-individual conflict and heightened inter-individual
variation seemed likely.

After the scale values on the general scale have been obtained
it would probably be quite easy in most institutional settings to make
accurate links between the utility scale and the dollar scale. This
might be done through cost-accounting or judgmental procedures. It
would greatly increase the meaningfulness of the scale and put many
intra- and inter-institutional relationships on a quantitative basis.




CHAPTER  IX 

SUMMARY  AND  CONCLUSIONS 

The correlational approach to selection test evaluation was
examined and found to have serious limitations. An approach based
on statistical decision theory was developed. Two new methods were
presented, one called the utility function method and the other the
decision-theoretic method. The former is largely based on Brogden's
work and involves the comparison of criterion groups in terms of their
utility to the institution using the selection test being evaluated.
The decision-theoretic method is based on statistical decision theory
and involves the construction of a payoff matrix corresponding to the
contingency table relating the test to the criterion. The cell
frequencies are weighted in a utility equation by the payoff values in
the corresponding cells of the payoff matrix. This utility equation
represents a new test evaluation index which directly expresses the
utility of the test to the institution using it.

Both of these new methods require the measurement of values
peculiar to the institution using the test. The utility function
method requires that the performance criterion be translated to a
utility function, while the decision-theoretic method requires that
a payoff matrix be developed which reflects the gains and losses each
cell observation represents to the institution.

The three approaches (correlational, utility function, and
decision-theoretic) were compared with tests used to select students
for technical schools in the U. S. Navy. Scaling techniques were
developed for the measurement of values inherent in the Navy situation.
Specifically, the graduate-fail criterion was translated to a utility
scale and the job areas were scaled on need (or the utility of
graduates to the Navy). Using scale values obtained for the job areas, a
payoff matrix was constructed for each school on the assumption that
the presently used test cutoffs are optimal.

The three approaches led to quite different indications
regarding the utility of the selection tests evaluated. The two new
methods agreed in terms of the proportion improvement over chance
prediction provided by the tests while the correlational approach
tended to underestimate this proportion. In terms of practical
significance the decision-theoretic approach led to much more positive
conclusions regarding the tests than did the other two approaches.

In  addition  to  the  above,  perhaps  the  following  conclusions 
can  be  drawn  from  this  study: 

(1)  Statistical  decision  theory  is  particularly  well  suited 
for  the  usual  test  evaluation  situation. 

(2)  Psychological  scaling  methods  provide  a  solution  for  the 
measurement  of  values  required  in  the  application  of  the  decision- 
theoretic  approach  to  test  evaluation. 
(3) Supplementation of correlational analysis of tests with
decision-theoretic analysis is likely to lead to new insights into
the utility of tests for personnel decisions.
APPENDIX A. ALTERATION OF THE PAYOFF MATRIX

Theorem: The difference U - U_c is independent of the addition
of any constant to the values of both entries in a row of the payoff
matrix where

    U = n_1 U_1 + n_2 U_2 + n_3 U_3 + n_4 U_4,                                 (1)

    U_c = (p - pq)N U_1 + pqN U_2 + (1 - p - q + pq)N U_3 + (q - pq)N U_4,     (2)

    N = n_1 + n_2 + n_3 + n_4,

and where the contingency matrix is

      n_1      n_2   |   p
      n_3      n_4   |   1 - p
     -----------------
     1 - q      q

and the payoff matrix is

      U_1      U_2
      U_3      U_4

Proof: Consider the following matrix:

      U_1 - k      U_2 - k
      U_3          U_4

For this matrix U is

    U = n_1(U_1 - k) + n_2(U_2 - k) + n_3 U_3 + n_4 U_4                        (3)

and U_c is

    U_c = (p - pq)N(U_1 - k) + pqN(U_2 - k)
          + (1 - p - q + pq)N U_3 + (q - pq)N U_4.                             (4)

Since the last two terms of Equations (3) and (4) are the same as the
corresponding terms of Equations (1) and (2) respectively, proof of
the theorem involves showing that

    n_1 U_1 + n_2 U_2 - [(p - pq)N U_1 + pqN U_2]
        = n_1(U_1 - k) + n_2(U_2 - k) - [(p - pq)N(U_1 - k) + pqN(U_2 - k)].

Simplifying

    n_1 U_1 + n_2 U_2 - pN U_1 + pqN U_1 - pqN U_2
        = n_1 U_1 - n_1 k + n_2 U_2 - n_2 k - pN U_1 + pNk
          + pqN U_1 - pqNk - pqN U_2 + pqNk.

Canceling yields

    0 = -n_1 k - n_2 k + pNk.                                                  (5)

Now since

    p = n_1/N + n_2/N,

Equation (5) can be written

    0 = -n_1 k - n_2 k + (n_1/N + n_2/N)Nk

    0 = -n_1 k - n_2 k + n_1 k + n_2 k

    0 = 0.
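
The theorem can also be checked numerically. The following Python sketch is an illustration only: it uses the hypothetical contingency and payoff values from Tables 14 and 15 rather than any data from the study, computes U and U_c as in Equations (1) and (2), and the function name and the constants k are assumptions.

```python
import math

def utility_difference(n, payoff):
    """U - U_c for cell frequencies n = (n_1..n_4) and payoffs (U_1..U_4)."""
    n1, n2, n3, n4 = n
    u1, u2, u3, u4 = payoff
    total = n1 + n2 + n3 + n4
    p = (n1 + n2) / total   # proportion in the first row
    q = (n2 + n4) / total   # proportion in the second (accept) column
    u = n1 * u1 + n2 * u2 + n3 * u3 + n4 * u4                       # Equation (1)
    u_c = ((p - p * q) * total * u1 + p * q * total * u2
           + (1 - p - q + p * q) * total * u3
           + (q - p * q) * total * u4)                              # Equation (2)
    return u - u_c

n = (18, 52, 22, 8)          # hypothetical contingency table
payoff = (-10, 8, 6, -5)     # hypothetical payoff matrix
base = utility_difference(n, payoff)

for k in (-25, 3.5, 100):
    u1, u2, u3, u4 = payoff
    first_row_shifted = (u1 + k, u2 + k, u3, u4)
    second_row_shifted = (u1, u2, u3 + k, u4 + k)
    print(k,
          round(utility_difference(n, first_row_shifted), 6),
          round(utility_difference(n, second_row_shifted), 6))
```

Whatever constant is added to a row, the difference stays at the value obtained for the unaltered matrix, up to floating-point rounding.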
APPENDIX B. THE QUESTIONNAIRE USED
IN CONVERTING THE GRADUATE-FAIL
CRITERION TO A UTILITY SCALE


INTRODUCTION 


In today's technically advancing Navy, personnel policies must
be kept up to date. You can help in this task by answering the
questions which make up this questionnaire. This information will
be considered by the Chief of Naval Personnel, along with other
information obtained from other sources, in revising personnel policies
and practices.

The questions deal with certain of your experiences and opinions
regarding Navy training in your own rating. The answers you provide
will be used only for research purposes and will in no way affect you
as an individual. Please answer all questions even though you can
provide only a rough guess on some.

Answer the questions on your own. Do not discuss them with
others. Your judgment is important to this research and to the Navy.

A. Identification and Background Information

1.  Name ________ (Last)  ________ (First)  ________ (Middle)

2.  Service Number ________

3.  Pay Grade ________

4.  Rating (ET, YN, etc.) ________

5.  Ship or Station ________

6.  Indicate your attendance at schools for your present rating:

    School        Attended?       If yes, did you graduate?
    A-School      Yes   No        Yes   No
    B-School      Yes   No        Yes   No
    C-School      Yes   No        Yes   No

7.  How long have you been in your present rating? ____ years.

8.  How much of the above time were you engaged in the duties of
    your present rating? ____ years.

9.  Approximately how many men in your rating have you supervised
    for an extended period, say 3 months or more? ____ (total
    number during your career).

10. Approximately how many of those you supervised were graduates
    of the A-School for your rating? ____.

11. Approximately how many men who were dropped from the A-School
    for your rating because of failing grades have you
    worked with, supervised, or trained on the job? ____.

12. Approximately how many strikers in your rating who had no A-
    School training have you worked with, supervised, or trained?
    ____.

13. Have you been an instructor for your rating in A-School ____;
    B-School ____; C-School ____?

B. Judgments Regarding Training in Your Rating

In this section you are to compare graduates and dropouts
(failures) from the A-School for your rating. You are asked to judge
them in terms of their value to the Navy during their first
enlistment. Consider their contribution to the efficiency and
capability of the Navy.

1. Assuming that the average graduate of the A-School for your
rating is worth $10,000 to the Navy during his first enlistment,
how much is the average dropout from that school worth
who receives on-the-job training in your rating? (The time
period to be considered in both cases is the 4 years of their
first enlistment.)

$ ________.00

Notice that you are to consider only some dropouts, namely,
only those who later receive on-the-job training in your rating.
(For the purpose of this questionnaire assume that you
personally did not conduct this training.) Try to estimate the
average dropout's over-all value to the Navy within your
rating, not just his value on a particular piece of equipment
or on a subtask within the rating. Use the $10,000 figure
as a guide or standard.

2. Now consider those persons in your rating who never had any
School training--they went directly to the fleet after recruit
training and became strikers in your rating. How much is the
average non-school striker in your rating worth to the Navy
during his first enlistment? As before, use the $10,000 figure
for graduates as a guide or standard.

$ ________.00
APPENDIX C. THE QUESTIONNAIRE USED IN THE
PROBABILITY COMPARISON SCALING METHOD


CLASSIFICATION INTERVIEWER OPINION SURVEY


Name ________ (Last)  ________ (First)  ____ (Initial)    Billet ________

Rate/Rank ________    Years in present billet ____

Years Service ____    Years experience in classification ____

What this is about

In today's fast changing Navy, personnel policies must be kept
up-to-date. You can help in this important task by answering the
questions below.

This  questionnaire  deals  with  the  classification  of  recruits  for 
assignment  to  Class  "A"  schools.  Its  purpose  is  to  discover  what 
classification  decisions  you  would  make  in  a  series  of  artificial 
situations.  Your  responses  will  be  combined  with  those  of  other 
classifiers  in  an  attempt  to  discover  what  pattern  of  decisions  are 
made  by  a  group  of  experienced  classification  interviewers. 

This  questionnaire  is  being  given  only  for  research  purposes  at 
the  present  time.  No  participant  will  be  identified  by  name  or  in 
any  other  way  in  the  research  reports. 

PART  I 


Directions  for  Part  I 

Each  question  refers  to  a  Class  "A"  school  and  to  an  imaginary  re¬ 
cruit  who  is  to  be  classified. 

As  everyone  knows,  you  can  not  be  absolutely  sure  that  every  re¬ 
cruit  you  send  to  a  school  will  do  well.  You  no  doubt  attempt  to 
determine  each  recruit's  chances  of  success  in  various  schools  during 
the  interview.  Assume  in  each  question  below  that  you  have  decided 
on  the  basis  of  his  test  scores,  interests  and  experience  that  he  is 
best  suited  for  the  rating  to  which  the  school  corresponds. 

You are to indicate whether you would send him to that school or
not if you think he has the chance of success stated in the question;
"success" means graduating without being set back or singled out for
an undue amount of tutoring.

Your  judgements  will  probably  reflect  differences  in  school  quotas 
and  shortages  in  the  ratings.  Try  to  assume  a  stable  quota  situation, 
using  as  the  basis  for  your  judgements  the  average  situation  as  it 
existed  during  1963. 

Sample question

Would you send him to Electronics
Technician school if you think he
has                                       Yes or No

(a) a 50% chance of success?                 no
(b) a 60% chance of success?                 no
(c) a 70% chance of success?                 yes
(d) an 80% chance of success?                yes
(e) a 90% chance of success?                 yes

The person who answers this question in this way would not send a
recruit to ET school if he believes the recruit has a 50% or a 60%
chance of success in that school. He would send him to ET school if
he believes the recruit has at least a 70% chance of success in that
school.

The questions

Please answer the following questions, writing "yes" or "no" in
each blank as was done in the sample question.

1. Would you send him to Electronics
   Technician school if you think he has       Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

2. Would you send him to Storekeeper
   school if you think he has                  Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

3. Would you send him to Radioman
   school if you think he has                  Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

4. Would you send him to Postal Clerk
   school if you think he has                  Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

5. Would you send him to Hospital
   Corpsman school if you think he has         Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

6. Would you send him to Machinist's
   Mate school if you think he has             Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

7. Would you send him to Disbursing
   Clerk school if you think he has            Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

8. Would you send him to Sonarman
   school if you think he has                  Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

9. Would you send him to Engineman
   school if you think he has                  Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____

10. Would you send him to Yeoman
    school if you think he has                 Yes or No
   (a) a 50% chance of success?                 ____
   (b) a 60%   "     "     "                    ____
   (c) a 70%   "     "     "                    ____
   (d) an 80%  "     "     "                    ____
   (e) a 90%   "     "     "                    ____


PART  II 


Directions  for  Part  II 

Each  question  refers  to  two  Class  "A"  schools  and  to  an  imaginary 
recruit  who  must  be  sent  to  one  of  the  two  schools.  (Assume  that  you 
have  decided  on  the  basis  of  his  test  scores,  interests  and  experience 
that  he  should  be  sent  to  one  or  the  other  of  these  schools.) 

In the questions below, the recruit's chances of success are
stated. You are to indicate your preference of assignment by placing
a mark in one of the two spaces in each line. Since your preferences
might vary with changes in quotas and with shortages in the ratings,
take the average conditions during 1963 as the circumstances for your
judgments.

Sample  question 

To which school would you assign a recruit if you think his chances
of success are

     Yeoman          Electronics Technician

(a)  80% ✓    and    60% _
(b)  80% _    and    70% ✓
(c)  80% _    and    80% ✓
(d)  80% _    and    90% ✓
(e)  80% _    and    95% ✓

The person who answers this question in this way believes it would
be better to assign a recruit to Yeoman school than to Electronics
Technician school if he has an 80% chance of success in YN school and
a 60% chance of success in ET school--in other words, if the chances
stated in (a) are true. If the recruit's chances of success are 80%
for YN school and 70% or above for ET school, this person would prefer
to assign the recruit to ET school.

The  questions 

Please answer the following questions, making a check in one of
the blanks in each line as was done in the sample question. Take
your time, resting frequently if the task seems difficult. As in
Part I, "success" means graduating without being set back or singled
out for a lot of special tutoring.


1. To which school would you assign a recruit if you think his
chances of success are

     Radioman        Engineman

(a)  70% _    and    60% _
(b)  70% _    and    70% _
(c)  70% _    and    80% _
(d)  70% _    and    90% _
(e)  70% _    and    95% _

(Be sure you made 5 marks--one for each pair of percentages)

2. To which school would you assign a recruit if you think his
chances of success are

     Postal Clerk    Yeoman

(a)  90% _    and    60% _
(b)  90% _    and    70% _
(c)  90% _    and    80% _
(d)  90% _    and    90% _
(e)  90% _    and    95% _

3. To which school would you assign a recruit if you think his
chances of success are

     Radioman        Hospital Corpsman

(a)  70% _    and    60% _
(b)  70% _    and    70% _
(c)  70% _    and    80% _
(d)  70% _    and    90% _
(e)  70% _    and    95% _



4. To which school would you assign a recruit if you think his
chances of success are

     Disbursing Clerk    Hospital Corpsman

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _

5. To which school would you assign a recruit if you think his
chances of success are

     Machinist's Mate    Sonarman

(a)  90% _    and    60% _
(b)  90% _    and    70% _
(c)  90% _    and    80% _
(d)  90% _    and    90% _
(e)  90% _    and    95% _

6. To which school would you assign a recruit if you think his
chances of success are

     Storekeeper     Engineman

(a)  90% _    and    60% _
(b)  90% _    and    70% _
(c)  90% _    and    80% _
(d)  90% _    and    90% _
(e)  90% _    and    95% _

7. To which school would you assign a recruit if you think his
chances of success are

     Engineman       Machinist's Mate

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _
8. To which school would you assign a recruit if you think his
chances of success are

     Postal Clerk    Engineman

(a)  90% _    and    60% _
(b)  90% _    and    70% _
(c)  90% _    and    80% _
(d)  90% _    and    90% _
(e)  90% _    and    95% _

9. To which school would you assign a recruit if you think his
chances of success are

     Electronics Technician    Radioman

(a)  70% _    and    60% _
(b)  70% _    and    70% _
(c)  70% _    and    80% _
(d)  70% _    and    90% _
(e)  70% _    and    95% _

10. To which school would you assign a recruit if you think his
chances of success are

     Sonarman        Electronics Technician

(a)  70% _    and    60% _
(b)  70% _    and    70% _
(c)  70% _    and    80% _
(d)  70% _    and    90% _
(e)  70% _    and    95% _

11. To which school would you assign a recruit if you think his
chances of success are

     Hospital Corpsman    Storekeeper

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _
12. To which school would you assign a recruit if you think his
chances of success are

     Engineman       Disbursing Clerk

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _

13. To which school would you assign a recruit if you think his
chances of success are

     Storekeeper     Postal Clerk

(a)  90% _    and    60% _
(b)  90% _    and    70% _
(c)  90% _    and    80% _
(d)  90% _    and    90% _
(e)  90% _    and    95% _

14. To which school would you assign a recruit if you think his
chances of success are

     Disbursing Clerk    Electronics Technician

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _

15. To which school would you assign a recruit if you think his
chances of success are

     Disbursing Clerk    Yeoman

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _

16. To which school would you assign a recruit if you think his
chances of success are

     Electronics Technician    Hospital Corpsman

(a)  70% _    and    60% _
(b)  70% _    and    70% _
(c)  70% _    and    80% _
(d)  70% _    and    90% _
(e)  70% _    and    95% _

17. To which school would you assign a recruit if you think his
chances of success are

     Machinist's Mate    Storekeeper

(a)  90% _    and    60% _
(b)  90% _    and    70% _
(c)  90% _    and    80% _
(d)  90% _    and    90% _
(e)  90% _    and    95% _

18. To which school would you assign a recruit if you think his
chances of success are

     Hospital Corpsman    Sonarman

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _

19. To which school would you assign a recruit if you think his
chances of success are

     Yeoman          Storekeeper

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _



20. To which school would you assign a recruit if you think his
chances of success are

     Sonarman        Radioman

(a)  70% _    and    60% _
(b)  70% _    and    70% _
(c)  70% _    and    80% _
(d)  70% _    and    90% _
(e)  70% _    and    95% _

21. To which school would you assign a recruit if you think his
chances of success are

     Hospital Corpsman    Yeoman

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _

22. To which school would you assign a recruit if you think his
chances of success are

     Radioman        Disbursing Clerk

(a)  70% _    and    60% _
(b)  70% _    and    70% _
(c)  70% _    and    80% _
(d)  70% _    and    90% _
(e)  70% _    and    95% _

23. To which school would you assign a recruit if you think his
chances of success are

     Yeoman          Engineman

(a)  80% _    and    60% _
(b)  80% _    and    70% _
(c)  80% _    and    80% _
(d)  80% _    and    90% _
(e)  80% _    and    95% _



24. To which school would you assign a recruit if you think his
chances of success are

     Postal Clerk    Machinist's Mate

(a)  90% _    and    60% _
(b)  90% _    and    70% _
(c)  90% _    and    80% _
(d)  90% _    and    90% _
(e)  90% _    and    95% _
APPENDIX D. THE QUESTIONNAIRE USED IN MEASURING
THE UTILITY OF GRADUATES BY THE MAGNITUDE
ESTIMATION SCALING METHOD


Name (Last, First) _    Position _    Years Service _

Years in present position _    Rank/Rate _

Years in present command _


The Estimation of Manpower Needs

What this is about

As you know, there are severe shortages of personnel in certain
ratings. We often refer to these ratings as "critical." Other
ratings are less critical since the supply of qualified persons in
these ratings is more nearly sufficient to meet the requirements of
the Navy. Still other ratings have enough men and are not critical
at all.


The ratings of the Navy can be thought of as lying on a scale
which runs from non-critical at one end to very critical at the other
end. A rating's position on this scale would indicate how badly the
Navy needs more men in that rating.

One way to determine how critical a rating is, is to ask
experts to place the ratings on a numerical scale. An "expert" is
defined as someone who knows a great deal about the personnel needs
of the Navy. Since you are involved in the distribution of enlisted
personnel, you are an expert in this regard.


This  is  a  research  effort 

You  will  be  asked  below  to  indicate  how  critical  you  believe 
certain  ratings  to  be.  We  want  your  own  opinion  so  do  not  discuss 
this  with  others  or  consult  other  estimates  of  personnel  needs.  Your 
answers  will  be  used  only  for  research,  and  will  be  held  confidential. 


Avoid  short-term  fluctuations 

Try to base your judgements on an extended time period so as
to avoid short-term fluctuations in the need for, and the supply of,
men in certain ratings. Use the calendar year 1963 as the period for
your judgements.


The  task 

At this time we are interested in just 10 ratings:

Radioman _                  Hospitalman _
Disbursing Clerk _          Sonarman _
Electronics Technician _    Postal Clerk _
Yeoman _                    Storekeeper _
Machinist's Mate _          Engineman _

_    0    (not critical)

First, write the name of a rating which you believe was not at all
critical in the bottom blank. Do not use any of the 10 ratings
listed. This rating should represent the zero point of the scale--
a rating in which there was an abundance of men.

Next, put the number 100 in the blank at the right of the
rating that you believe was the most critical of the 10 listed. Now
assign numbers between zero and 100 to the other ratings to show how
critical they were in your judgment. But first, understand that these
numbers should be chosen in relation to the zero and the 100 which you
have already assigned to ratings. Thus, if you think that one of the
ratings was exactly half as critical as the one that you chose as most
critical, assign to it the number 50; or if you think it was one-fourth
as critical, assign to it the number 25, etc. Write the number chosen
for each rating on the list in the blank at the right of that rating.

In this way you will be placing the ratings on a 100-point scale.


WORKTABLE  IN  THE  COMPUTATION  OF  THE  SUCCESS  FUNCTION  QUANTITIES 
APPENDIX E. SUCCESS FUNCTION CALCULATIONS

Column (11) entries are means of the success functions expressed in units of the test scales.
Column (12) entries are σ's of the success functions.

Column (13) entries are σ's of the success functions expressed in units of the test scales.
Column  (15)  entries  are  success  function  z-scores  of  the  cutoffs. 

All  entries  are  population  values. 
The correlational approach to selection test evaluation was examined and found to
have serious limitations. An approach based on statistical decision theory was
developed. Two new methods were presented, one called the utility function method and
the other the decision-theoretic method. The former involves the comparison of criterion
groups in terms of their utility to the institution using the selection test. The
decision-theoretic method is based on statistical decision theory and involves the
construction of a payoff matrix corresponding to the contingency table relating the test
to the criterion. The cell frequencies are weighted in a utility equation by the payoff
values in the corresponding cells of the payoff matrix. This utility equation represents
a new test evaluation index which directly expresses the utility of the test to
the institution using it. Both of these new methods require the measurement of values
peculiar to the institution using the test. The utility function method requires that
the performance criterion be translated to a utility function, while the
decision-theoretic method requires that a payoff matrix be developed which reflects the gains
and losses each cell observation represents to the institution. The three methods
(correlational, utility function, and decision-theoretic) were compared with tests used
to select students for A-Schools in the U.S. Navy. The three methods led to quite
different indications regarding the utility of the selection tests evaluated. The two new
methods agreed in terms of the proportion improvement over chance prediction provided
by the tests, while the correlational method tended to underestimate this proportion.

In terms of practical significance, the decision-theoretic method led to much more
positive conclusions regarding the tests than did the other two methods.
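
The weighting of cell frequencies by payoff values described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the 2 x 2 layout (rows are the test decision, columns are the criterion outcome), the frequencies, the payoff values, and the function names. It is not the study's data, and it is not claimed to reproduce the study's exact utility equation. The chance baseline keeps the same selection ratio and base rate but assumes the test is unrelated to the criterion.

```python
# Hypothetical 2 x 2 example: rows = test decision (accept, reject),
# columns = criterion outcome (success, failure). All numbers are invented.

def weighted_utility(freq, payoff):
    """Average payoff per person: cell frequencies weighted by cell payoffs."""
    total = sum(sum(row) for row in freq)
    weighted = sum(
        f * u
        for freq_row, payoff_row in zip(freq, payoff)
        for f, u in zip(freq_row, payoff_row)
    )
    return weighted / total

def chance_table(freq):
    """Expected cell frequencies with the same marginals (selection ratio
    and base rate) when the test carries no information about the criterion."""
    total = sum(sum(row) for row in freq)
    row_totals = [sum(row) for row in freq]
    col_totals = [sum(col) for col in zip(*freq)]
    return [[r * c / total for c in col_totals] for r in row_totals]

# Assumed contingency table: 70 accepted, 30 rejected; 70 succeed, 30 fail.
freq = [[60, 10],
        [10, 20]]

# Assumed payoffs: gains for correct decisions, losses for incorrect ones.
payoff = [[1.0, -1.0],
          [-0.5, 0.5]]

utility_with_test = weighted_utility(freq, payoff)
utility_by_chance = weighted_utility(chance_table(freq), payoff)
gain_over_chance = utility_with_test - utility_by_chance

print(round(utility_with_test, 2), round(utility_by_chance, 2),
      round(gain_over_chance, 2))
```

Changing only the payoff matrix while holding the frequencies fixed shows how the same contingency table can look more or less valuable to different institutions, which is the situational information a correlation coefficient alone does not carry.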